Situations like this remind me just how important it is to have open data standards and open software to back our open access scientific work. PLoS seems to have failed on that front over and over again. Despite that failure, I still consider PLoS among the very best of the open access venues. Andrew Tritt, some other coauthors, and I submitted a paper to PLoS One last week. They’re on a new manuscript management system which, like the last one, is closed-source, and which even more strictly enforces the use of closed-source proprietary software for manuscript preparation. The good news is that P1 now accepts LaTeX (which at least is open, although it is far from a well-defined standard!). The bad news is that vector graphics can’t be submitted unless they have been processed by Acrobat Distiller. We spent hours trying to tweak our PDF and EPS figures to sneak them through the system. Even EPS and PDF files exported from Adobe Illustrator wouldn’t make it through. Has anybody else made this work using open-source gadgetry? Alas, for now PLoS has taken one step forward, two steps back.
Fortunately, PLoS and Google look set to turn over a new leaf with Annotum. They’ve got a GitHub site with source code, and in my book that’s a great way to start. I hope they get lots of creative and useful code contributions from the very scientists who will use the publishing platform. And I hope they can keep the public version in line with any private branch of the code that PLoS may run, so that we the scientists have the tools to serve our own (published in PLoS) content!
]]>For those of you working in some aspect of genomics, I encourage you to submit a paper to the conference. RECOMB-CG is the premier venue for computational studies of genome rearrangement, genome structural variation, and their evolutionary patterns. Three of my favorite topics. It also deals in a host of other comparative genomics topics.
The conference web site and submission information are at http://recombcg.org
]]>Somewhat paradoxically, the reviewers who spend countless hours writing high quality reviews go largely unrewarded. They are unpaid volunteers, and the cultural practice of anonymous peer review prevents the reviewer from receiving credit from the rest of the scientific community. The only person who ever has the opportunity to associate the reviewer’s name with the review is the journal editor, and many journal editors, especially those at high profile journals, are professionals who do not participate directly in academic research (or in faculty hiring committees). One could maybe argue that high quality reviews lead to high quality science, and so there is some very real society-level benefit that trickles down to the reviewer herself. Maybe.
The culture of anonymous peer review seems totally perverse and broken to me. I’ve heard the argument that anonymity can be important for the situation where Professor AlphaChimp attempts to publish work with severe deficiencies. When Mr. OmegaNerd points those deficiencies out in the manuscript review he can do so without worry of getting bullied by AlphaChimp by virtue of the veil of anonymity. But that same veil prevents the rest of the scientific community from praising Mr. OmegaNerd for his excellent detective work and for having the gumption to communicate about the problems without restraint and with clarity.
Fortunately I think the situation outlined above is less common than many proponents of anonymous peer review would like us to believe. By the time of journal submission many papers are well reasoned. A more common occurrence is that the reviewer, possibly in part due to the lack of incentive to read the paper carefully and write a quality review, writes a careless negative review that prevents the paper from getting published. But probably most common of all is that the reviewers provide thoughtful, balanced, and helpful reviews.
While publishing some of my own work recently I found myself with some unusually well thought-out reviews. The comments made by the reviewers would certainly be of interest to future readers of the manuscript. Even though they were anonymous, I wanted to publish the peer reviews along with the paper here on my blog or as comments on the journal’s web site. But after inviting the opinions of several colleagues I decided not to do it. Everyone thought publishing the anonymous reviews was a bad idea, mainly because the reviewers wrote with the expectation that their comments would remain private to the editor, themselves, and the authors. But it also brought my attention to the legal question of copyright on anonymous peer reviews:
Can an author legally publish someone’s anonymous peer review? Do the anonymous reviewers hold an implied copyright on their reviews? If so, and if one were published against a reviewer’s will, would that reviewer have to compromise anonymity to request the removal of a published review (and induce the Streisand effect in doing so)?
Even though I wanted to publish the anonymous reviews it seemed unwise to publish them along with the work itself. Had I been able to contact the reviewers to ask permission they might have agreed! But anonymity prevented me from (conveniently) contacting them and the reviewers did not indicate whether I had permission to republish their reviews.
So from now on I have decided to place a copyright statement on all of my reviews to make it abundantly clear to the authors whether they have permission to republish the review. The license I have chosen for my reviews is the Creative Commons Attribution No Derivatives license. That license will allow manuscript authors to publish my review along with their work, so long as the review is published in unmodified form. The attribution aspect implies that my name will be attached to the review. I think that imparts a level of honesty in the review process, and I sign my reviews even if they are negative. I understand not all reviewers share my views on that matter, but even if written anonymously, I think it would be wonderful if reviewers could indicate their willingness for the review to be shared by attaching a copyright licensing statement to their review.
Hear ye reviewers, go forth and write high quality reviews, and let them be published so you can receive credit for your hard work!
It may be possible with some legal gymnastics to attach an enforceable copyright license to an anonymous peer review, but I think that misses the point. Our reviews should be published to encourage high quality review and to provide credit to the reviewers who play an important role in advancing the state of the art in science. Some people argue that such a practice will merely lead to “old boys clubs” or networks of peers who always provide positive reviews of each other’s work and thus compromise the quality of science. But such social networks form whether peer review is anonymous or not. If reviews and reviewers’ names were published then at least there would be some greater transparency and accountability in the social networks. Maybe reviewers should insist on the right to publish their review in exchange for their hard work?
]]>———————————-
rachel whitaker
crispr repeats — history of host-viral interaction
———————————-
200 of 1300 spacers match known viruses
blast spacers against viral genomes
more similarity to viral genomes from geographically local places
strains from same location have spacers matching same viruses (geographic correlation)
more matches to SIRV virus in north american populations
vice versa for SSV/Kamchatka
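A rough sketch of the spacer-versus-virus matching idea from the notes above. This is not the pipeline from the talk (which used BLAST and so tolerates mismatches); it is a toy that only finds exact matches on either strand. The function names and the spacer and genome strings are made up for illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def match_spacers(spacers, viral_genomes):
    """Map each spacer id to the viruses that contain it on either strand.

    spacers and viral_genomes are dicts of name -> DNA sequence.
    Exact substring matching is a simplified stand-in for BLAST.
    """
    hits = {}
    for spacer_id, spacer in spacers.items():
        forward = spacer.upper()
        reverse = reverse_complement(forward)
        matched = [
            name
            for name, genome in viral_genomes.items()
            if forward in genome.upper() or reverse in genome.upper()
        ]
        if matched:
            hits[spacer_id] = matched
    return hits


# Hypothetical toy data: sp1 occurs forward in virusA and reversed in virusB.
toy_spacers = {"sp1": "ACGGATCCAA", "sp2": "GATTACAGATTACA"}
toy_viruses = {"virusA": "TTTACGGATCCAAT", "virusB": "CCCTTGGATCCGTAAA"}
toy_hits = match_spacers(toy_spacers, toy_viruses)
```

Tallying the matched viruses per sampling location would give the kind of geographic comparison described above, assuming each strain is labeled with where it was isolated.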
shifting gears to new topic:
39 strains from one hot springs.
predator prey oscillations?
most of the time there should be a single dominant strain.
MLST of 39 strains
appears to be one dominant clone
small number of coexisting strains branching deeper in MLST tree
look at CRISPRs in 12 genomes sequenced from these isolates
3 CRISPR/Cas loci. hypervariable.
tried to amplify with primers to sequence but only works 60% of time
lots of rearrangements in the crispr region, preventing amplification
C locus deletions in some strains
possible recombination in CRISPRs
need an alternative model:
do CRISPRs maintain host population diversity?
notion is that CRISPRs match different parts of the virus so an escape mutation in the virus will only affect part of the host population
Question by David Ward: can the dominant population be subdivided into several different subpopulations?
A: yes, the dominant strains actually group into three groups based on genome-wide similarity
Question by Matt Kane: how does this relate to the rate of genetic communication? are the viruses driving local speciation and diversification?
————————-
Vibrio paracholerae/cholerae genome conversion
————————-
paracholerae sister taxa to cholerae. not usually human pathogenic.
isolates from Bangladesh and Woods Hole.
phylogeny of strains using 6 marker genes shows a mix of topology
look at integron for markers. variable part of the genome, 3%.
attI shuffling locus for homologous recombination-mediated rearrangement of integron genes
integron cassette phylogeny clusters by geography, not by species (as defined by housekeeping genes).
integrons are highly variable in gene content, the integron gene pool families are not part of the main vibrio gene pool.
that is: integron genes are always in integron regions, never in the rest of the genome.
annotated gene functions in integrons appear to be “ecologically relevant”?
—————————
David Johnson
Why does cross-feeding occur in Microbial communities?
—————————
Why are microbial communities so diverse, and how is the diversity maintained?
How are communities assembled?
Substrate cross-feeding: does it maintain diversity?
Studying denitrification .. Nitrate to N2
Some bugs do total denitrification, other bugs only have part of the pathway and assemble into communities that create a complete pathway.
Why don’t bugs with the complete pathway do better?
Mathematical model of fitness of each approach to denitrification.
assumptions: cell has a limited number of enzymes
convex constraint function creates cross-feeding, concave constraint function creates complete pathway organisms
experimental evaluation
nitrogen mutants of Pseudomonas stutzeri, measure the shape of the constraint function by assembling communities of mutants where each community member carries out part of the pathway
Had to delete the complete transformation system in order to get experiment to work.
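To convince myself of the convex-versus-concave claim, here is a toy calculation. It is my own simplification, not the model from the talk: a two-step pathway where a cell's rates r1 and r2 are constrained by r1**k + r2**k = 1, so the exponent k (an assumed parameter) sets the shape of the trade-off curve. k < 1 gives a convex curve, k > 1 a concave one.

```python
def tradeoff_curve(r1, k):
    """Rate available for step 2 given rate r1 for step 1, on r1**k + r2**k = 1."""
    return (1.0 - r1 ** k) ** (1.0 / k)


def generalist_throughput(k):
    """Per-cell rate of each step when one cell runs both steps equally (r1 == r2)."""
    return 0.5 ** (1.0 / k)


def cross_feeding_throughput():
    """Per-cell rate of each step in a 50/50 mix of two full specialists.

    Each specialist runs its own step at rate 1, but is only half the community.
    """
    return 0.5


def favoured_strategy(k):
    """Compare pathway throughput per cell for the two community designs."""
    generalist = generalist_throughput(k)
    specialists = cross_feeding_throughput()
    if generalist > specialists:
        return "complete pathway"
    if generalist < specialists:
        return "cross-feeding"
    return "tie"


for k in (0.5, 1.0, 2.0):
    print(k, round(generalist_throughput(k), 3), favoured_strategy(k))
```

In this toy the linear case (k = 1) is the break-even point: bend the curve toward the origin and splitting the pathway across two organisms gives more flux per cell, bend it outward and the complete-pathway organism does better.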
———————————
Pathogens
———————————
pathogen: any organism that reduces the fitness of a host
bacteriophage: strongly select for phage resistance
- may indirectly select for less pathogenic bacteria
investigated selection by parasitic phages and thermal environments on evolution of bacterial pathogenicity
microcosm experiment with Serratia marcescens
bacteria cultured with lytic phage PPV in low and high temp
changes in pathogenicity measured in vivo using wood tiger moth larvae
temp and phage treatment had little effect on bacterial growth traits
previous phage exposure allowed larger population sizes in subsequent exposure
S. marcescens becomes more pathogenic when cultured at 37 °C without phage, but not with phage.
is it due to reduced maximum population size caused by phage??
motility maladaptive in presence of phage, but good for pathogenicity?
———————-
cell-to-cell electron transfer in geobacter
Zarath Summers
———————–
interspecies hydrogen transfer — one organism exports H2 other exports H+ to reduce CO2 to CH4
Geobacter, non fermenters, iron citrate respiration
Pelobacter, syntrophic
create cocultures of sulfurreducens and metallireducens
syntrophic when growing on ethanol fumarate media.
metallireducens oxidizes ethanol, sulfurreducens consumes H2+acetate, reduces fumarate.
grew slowly at first.
serial transfer to select for fast growers on the media.
large aggregates emerged!
1mm in size. large spherical structures by transfer #30.
channels in aggregates, possibly to promote exchange with the environment around them
One organism exists in 15% abundance, other is 85%.
spatial arrangement determined by FISH, nice mix of organisms.
solexa resequencing from transfer 15. found single mutation in pilR regulator that causes frameshift and premature stop codon.
possible up-regulation of OmcS.
conductive pili — proposed mechanism of syntrophy.
interconnected web of pili in the aggregate, electrons go thru pili from metallireducens to sulfurreducens
test conductance using 2 plate gold with aggregates in gap between plates. yep, conducts electricity.
very cool!
Q: able to separate the species from aggregates?
A: no
Q: can the organism abundances be explained metabolically?
A: the metallireducens (electron donor) organism is in lower abundance, not sure yet about metabolism.
That will all have to wait, because today I’m posting my notes from the ISME 13 conference. ISME stands for the International Society for Microbial Ecology, and the society meeting occurs every two years, this time in Seattle. I wasn’t planning to attend ISME, but Jenna Morgan has a poster here for work on which I’m a coauthor, and since she couldn’t make it I’m presenting the work on her behalf. Last time I attended the meeting in Cairns I thoroughly enjoyed it, and yesterday’s talks and posters did not disappoint either.
Without further ado, here are some notes…
————————–
AmpliconNoise
————————–
I only caught about a third of this one.
Rare biosphere seems to be a sequencing artifact created by protocol noise and the 3% OTU cutoff
Open source software method to clean up 16S pyrotag data:
http://code.google.com/p/ampliconnoise/
YAY! open source
Substantially reduces inferred OTU counts, makes that pesky rare biosphere disappear.
———————————
Tal Dagan talk on Lateral Gene Transfer
———————————
60% of genes have lgt
analyzed about 650 genomes
directed network gains/losses
origins inferred using codon bias
non-homologous end joining proteins
200 genomes encode both subunits
nhej positive organisms appear to acquire from more distant organisms
———————————
widespread HR in Streptomyces
Daniel H Buckley
———————————
defines LGT to be allelic replacement
Fraser 2006 rates decline log-linearly with seq similarity based on MLSA analysis
some organisms very clonal, others recombining wildly at the tips of the tree
now let’s look at wild isolates
Streptomyces soil organism, wet dirt smells like Streptomyces B.O.
Half of antibiotics coming from Streptomyces
Sporulating — exospore. hyphae grows from spore, dna replicates during hyphae growth. hyphae form mycelium.
Streptomyces can do dsDNA exchange. depends on traB, atp dependent dsDNA translocator.
traB localizes at tips of hyphae, speculates traB might mobilize entire chromosomes during hyphae fusions
Someone designed MLST for Strepto
Six loci, not a single pair of congruent trees.
Used RDP to infer some recombination events. Signal looks a bit blurry.
ClonalFrame tree of Streptomyces trees
> 40% of genes impacted by HGT
HGT gene pairs have average 6% divergence — events are probably old.
did some sampling of S. flavogriseus pratensis from around eastern half of US.
samples are 99.8% identical on average
within “species” rho/theta = 27
cross species rho/theta < 1.
suggests cohesion
forward time simulation under a neutral model
estimates of divergence and rho/theta from Vos and Didelot approximately match the neutral speciation model predictions
Konstantinidis 2008 “valley of genetic discontinuity” — lack of reads recruiting at 86-92% identity
Fred Cohan asked a question or rather, made a statement about recombination driven cohesion and ecotypes. It seemed like he was suggesting that ecotypes exist only in the niche-specifying genes, and the recombining core genes don’t give the data necessary to identify the ecotype. So the ecotype model gets pushed further into the margins of the genome and we now have something Jeffrey Lawrence might call “ecotypes in pieces” (for the unfamiliar, that’s a reference to Lawrence’s species in pieces work).
————————–
Broad host range plasmids LGT in bacteria
————————–
How frequently do bacteria swap DNA and plasmids with each other?
What is the host range of particular plasmids?
How do those plasmids evolve over time?
Use a model plasmid: IncP-1
simplified the plasmid by removing various transfer genes
Use Shewanella oneidensis MR-1 as a host
Ancestral plasmid unstable, but can naturalize into a new host over 1000 generations.
How did they adapt?
Can the adapted plasmids be moved to yet another new host? Not always.
many mutations localized in the replication protein TrfA1
evolved plasmid genes have copy-number amplification
conclusions: drug resistance plasmids can rapidly adapt to new hosts
single gene changes can facilitate host shifts
—————————————
GeneFish
—————————————
create a recombinant E. coli to capture environmental DNA
environmental DNA ends up on plasmid
Simplify metagenome protocol by getting environmental DNA directly into the cloning vector.
avoid DNA extraction bias, other biasing steps. (but isn’t cloning a bias??)
get env DNA into plasmid, select for it using colE3 relF toxin system, grow on plates.
selectable recombination seems to increase recomb frequency by 4-5 log units.
Now looking to test the system.
Use NarG gene. design homologous regions to narG gene for the plasmid
use 500bp homologous region
recombination frequency 10^-6
toxin selection effectiveness 25-50%?
tolerates 80-100% divergence from homologous region
not very good??
———————————————
Hinsby Cadillo, Sulfolobus sympatric species
disclaimer: I am a collaborator on this project
———————————————
S. islandicus genomes from the same location
39 strains from Mutnovsky formation, Kamchatka
MLSA some large clonal complexes, a few rare groups, some intermediate strains.
ClonalFrame infers presence of some recombination
12 genomes from M16 spring
2.285Mbp core genome, build a phylogeny based on core genome
ClonalOrigin analysis
higher than expected rates of exchange among closely related organisms, lower than expected exchange among more divergent organisms
geographically isolated populations do not recombine much at all
Recombination could be mediated by pili in S. solfataricus
respiratory nitrate reductase has a presence/absence pattern consistent with the phylogeny
gene content specified niche separation, within-niche cohesion by homologous recombination
]]>Last year when I was working at a research institute in Australia, I found that whenever I needed a new version of the NCBI databases I would have to download them from NCBI’s ftp site in the USA. That may work well enough for people located stateside, but the transpacific pipes do not seem to treat blast databases as priority electrons (photons, whatev). Usually after a few days and several dropped TCP connections I would finally have the whole enchilada. Not pleasant, but with a little persistence it was possible.
Enter biotorrents. Now I can fire up my favorite bittorrent client and download the NCBI database from any of a number of globally distributed seeders, which should work much faster. That’s the dream anyway. In reality biotorrents is in early days and we need people to contribute bandwidth to the effort by seeding things like the NCBI databases. Ideally, our great leaders at NCBI, EBI, and elsewhere would take the initiative and contribute by seeding their own databases. David Lipman and Ewan Birney if you’re listening, consider this a public challenge!!
]]>This year I’m co-mentoring a project with Marc Suchard that aims to develop a small and reusable open source software library to calculate phylogenetic likelihoods using CPUs and GPUs. The project is part of this year’s Phyloinformatics Summer of Code which is being operated by the National Evolutionary Synthesis Center (NESCent) through Google’s program. Lots of students applied for the project, perhaps because GPU computing is trendy and computer geeks tend to be some of the trendiest people around (even if not always socially graceful!). Nonetheless there were many strong applicants and in the end, the successful student was Daniel Ayres, a Ph.D. student at UMD.
The project is now well underway, with all sorts of development activity by mentors, mentee, and other folks interested in the notion of a reusable library for phylogenetic likelihood models.
Thanks to Google’s charitable arm for supporting so many students and projects!
]]>Update: someone pointed me to the Topaz project, which looks promising!
I am currently preparing an article for submission to an open access journal (PLoS One, to be specific). I have just learned that PLoS One, like many other journals, requires all articles to be submitted in either .doc or .rtf format. But why do I care? My article was originally written in the open-source LaTeX system and intended as a conference contribution. The article deals heavily in math and statistics and makes use of LaTeX’s excellent equation typesetting abilities. As far as I can tell, it’s no simple matter to convert a LaTeX document with equations to M$ Word format.
How can it be that the leaders of the open-access journal movement require submissions in a closed and proprietary format? Didn’t the open-access journal movement draw at least some of its inspiration from the free software movement that predated it by at least 10 years? I presume the answer to this question lies at least partially with the proprietary nature of publishing and typesetting systems in common use at publishing houses. The good people at PLoS probably made a decision to purchase existing proprietary publishing software for their operation rather than investing in an alternative that supports open standards. And sadly, they now probably view change as too expensive.
To their credit, the topical PLoS journals do accept papers written with open-source software such as LaTeX, but that policy has only been in place recently. The editorial office converts LaTeX submissions on a case-by-case basis. Last year I published a paper authored in LaTeX in PLoS Genetics. While I was very happy that I didn’t have to do the conversion myself, I think that the PLoS approach (and that of other journals) essentially amounts to applying band-aids to a broken publishing system. It is not a good long term solution.
We need a scientific publishing system that is founded on open document standards and open source software. Viable alternatives such as OpenOffice exist, yet I cannot rely on OpenOffice to save complex equations in Microsoft Word documents (it works fine in the native OpenOffice format). PLoS should lead the way in revolutionizing scientific publishing, and they should start on the inside by developing a publication process based on open standards. After five years of PLoS, why are we still without a viable open-source platform for scientific publishing?
In the meantime, I have to carefully consider whether it’s a more effective use of my time to painstakingly convert my document to Word and support the status quo, or whether I should instead spend that time adding content that would make my article appropriate for a journal that will accept LaTeX. Reformatting documents is mind-numbing, while submitting elsewhere might actually involve some interesting work.
]]>It was time to get a new laptop and I decided to find out why people are making so much noise about netbooks. After poking around a few reviews, I narrowed down my candidates to the 10″ Asus eeePC, the MSI Wind U100, and the Acer Aspire One. I finally ended up with the MSI Wind based on its 160 GB hard drive, the option to buy a 9-cell battery which will last 6+ hours, claims of solid build and a respectable keyboard, and supposedly little heat and little noise. The major downside of the Wind is that it only ships with Windoze, so yet again I was stuck paying MS tax.
The laptop arrived in the mail this week, and my first impression is that the reviews were generally spot on, except when it comes to noise. The MSI Wind has a fan and a 2.5 inch spinning-hunk-o-metal hard drive inside, both of which can make a ruckus if you’re sitting in a seminar room. To put this in perspective, I’ve been using a Dell X1 for the past three years which has NO fan and uses a nearly silent 1.8″ hard drive. Of course the problem with the X1’s lack of a fan is that it can get quite toasty even when doing basic computing like web browsing. Why oh why did Transmeta have to die?
]]>