
ISME 13 day 2

Thursday, August 26th, 2010

Following on from my previous post, here are notes from day 2 of ISME.  These are the last of my ISME notes, since I only attended the first half of the meeting.

———————————-
rachel whitaker
crispr repeats — history of host-viral interaction
———————————-

200 of 1300 spacers match known viruses

blast spacers against viral genomes

more similarity to viral genomes from geographically local places

strains from same location have spacers matching same viruses (geographic correlation)

more matches to SIRV virus in north american populations
vice versa for SSV/Kamchatka
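
A rough sketch of the spacer-matching idea in Python. This is not the analysis from the talk (that used BLAST); it just slides each spacer along both strands of each viral genome and accepts a hit if there are only a few mismatches. The sequences, the names, and the `max_mismatches` threshold are all made up for illustration.

```python
def revcomp(seq):
    """Reverse complement of a DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[b] for b in reversed(seq))


def best_mismatches(spacer, genome):
    """Fewest mismatches between the spacer and any same-length window
    on either strand of the genome."""
    k = len(spacer)
    best = k
    for strand in (genome, revcomp(genome)):
        for i in range(len(strand) - k + 1):
            mm = sum(a != b for a, b in zip(spacer, strand[i:i + k]))
            best = min(best, mm)
    return best


def match_spacers(spacers, viruses, max_mismatches=2):
    """Map each spacer name to the list of virus names it matches."""
    hits = {}
    for s_name, s_seq in spacers.items():
        hits[s_name] = [
            v_name for v_name, v_seq in viruses.items()
            if best_mismatches(s_seq, v_seq) <= max_mismatches
        ]
    return hits


# Toy data (hypothetical sequences, not real SIRV/SSV genomes)
viruses = {
    "virus_local": "ATGCGTACGTTAGCCGATAGGCTTAACGGATCCA",
    "virus_distant": "TTTTGGGGCCCCAAAATTTTGGGGCCCCAAAA",
}
spacers = {
    "spacer1": "GTTAGCCGATAGG",  # exact match to virus_local
    "spacer2": "GTTAGCAGATAGG",  # one mismatch to virus_local
    "spacer3": "ACACACACACACA",  # matches nothing
}

hits = match_spacers(spacers, viruses)
matched = [name for name, found in hits.items() if found]
```

Tabulating `hits` by the sampling location of each strain and each virus would give the kind of geographic comparison described above.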

shifting gears to new topic:
39 strains from one hot springs.
predator prey oscillations?

most of the time there should be a single dominant strain.

MLST of 39 strains
appears to be one dominant clone
small number of coexisting strains branching deeper in MLST tree

look at CRISPRs in 12 genomes sequenced from these isolates

3 CRISPR/Cas loci.  hypervariable.
tried to amplify with primers to sequence but only works 60% of time

lots of rearrangements in the crispr region, preventing amplification

C locus deletions in some strains

possible recombination in CRISPRs

need an alternative model:
do CRISPRs maintain host population diversity?
notion is that CRISPRs match different parts of the virus so an escape mutation in the virus will only affect part of the host population

Question by David Ward: can the dominant population be subdivided into several different subpopulations?
A: yes, the dominant strains actually group into three groups based on genome-wide similarity

Question by Matt Kane: how does this relate to the rate of genetic communication?  are the viruses driving local speciation and diversification?

————————-
Vibrio paracholerae/cholerae genome conversion
————————-

paracholerae sister taxa to cholerae.  not usually human pathogenic.

isolates from Bangladesh and Woods Hole.

phylogeny of strains using 6 marker genes shows a mix of topology

look at integron for markers.  variable part of the genome, 3%.

attI shuffling locus for homologous recombination-mediated rearrangement of integron genes

integron cassette phylogeny clusters by geography, not by species (as defined by housekeeping genes).

integrons are highly variable in gene content, the integron gene pool families are not part of the main vibrio gene pool.
that is: integron genes are always in integron regions, never in the rest of the genome.

annotated gene functions in integrons appear to be “ecologically relevant”?

—————————
David Johnson
Why does cross-feeding occur in Microbial communities?
—————————

Why are microbial communities so diverse, and how is the diversity maintained?

How are communities assembled?

Substrate cross-feeding: does it maintain diversity?

Studying denitrification ..  Nitrate to N2

Some bugs do total denitrification, other bugs only have part of the pathway and assemble into communities that create a complete pathway.

Why don’t bugs with the complete pathway do better?

Mathematical model of fitness of each approach to denitrification.

assumptions: cell has a limited number of enzymes

convex constraint function creates cross-feeding, concave constraint function creates complete pathway organisms
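
A toy calculation in Python to make the convex/concave point concrete. It is not the model from the talk. It assumes a two-step pathway, a trade-off curve r1^p + r2^p = 1 between the per-cell rates of the two steps (convex when p < 1, concave when p > 1), and a pathway flux limited by the slower step. The function names and the functional form are assumptions chosen for illustration.

```python
def generalist_flux(p):
    """Flux of a population of complete-pathway cells.

    Each cell balances the two steps, so r1 == r2 and the constraint
    r1**p + r2**p == 1 gives r1 = r2 = 2 ** (-1 / p).
    """
    return 2.0 ** (-1.0 / p)


def cross_feeding_flux():
    """Flux of a 50/50 mix of two specialists.

    Each specialist runs only its own step at rate 1, but only half of
    the cells carry out each step, so each step runs at 0.5 overall.
    """
    return min(0.5 * 1.0, 0.5 * 1.0)


def winner(p):
    """Which strategy gives the higher flux for trade-off exponent p."""
    g, c = generalist_flux(p), cross_feeding_flux()
    if abs(g - c) < 1e-12:
        return "tie"
    return "complete pathway" if g > c else "cross-feeding"


results = {p: winner(p) for p in (0.5, 1.0, 2.0)}
```

Under these assumptions the convex curve (p = 0.5) favors the cross-feeding pair, the concave curve (p = 2) favors the complete-pathway cell, and the straight line (p = 1) is a tie.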

experimental evaluation

nitrogen mutants of Pseudomonas stutzeri, measure the shape of the constraint function by assembling communities of mutants where each community member carries out part of the pathway

Had to delete the complete transformation system in order to get experiment to work.

———————————
Pathogens
———————————

pathogen: any organism that reduces the fitness of a host

bacteriophage: strongly select for phage resistance
- may indirectly select for less pathogenic bacteria

investigated selection by parasitic phages and thermal environments on evolution of bacterial pathogenicity

microcosm experiment with Serratia marcescens
bacteria cultured with lytic phage PPV in low and high temp
changes in pathogenicity measured in vivo using wood tiger moth larvae

temp and phage treatment had little effect on bacterial growth traits

previous phage exposure allowed larger population sizes in subsequent exposure

S. marcescens becomes more pathogenic when cultured at 37C without phage, but not with phage.
is it due to reduced maximum population size caused by phage??

motility maladaptive in presence of phage, but good for pathogenicity?

———————-
cell-to-cell electron transfer in geobacter
Zarath Summers
———————–

interspecies hydrogen transfer — one organism exports H2 other exports H+ to reduce CO2 to CH4

Geobacter, non fermenters, iron citrate respiration
Pelobacter, syntrophic

create cocultures of sulfurreducens and metallireducens
syntrophic when growing on ethanol fumarate media.
metallireducens oxidizes ethanol, sulfurreducens consumes H2+acetate, reduces fumarate.

grew slowly at first.
serial transfer to select for fast growers on the media.
large aggregates emerged!
1mm in size.  large spherical structures by transfer #30.

channels in aggregates, possibly to promote exchange with the environment around them

One organism exists in 15% abundance, other is 85%.

spatial arrangement determined by FISH, nice mix of organisms.

solexa resequencing from transfer 15.  found single mutation in pilR regulator that causes frameshift and premature stop codon.

possible up-regulation of OmcS.

conductive pili — proposed mechanism of syntrophy.
interconnected web of pili in the aggregate, electrons go thru pili from metallireducens to sulfurreducens

test conductance using 2 plate gold with aggregates in gap between plates.  yep, conducts electricity.

very cool!

Q: able to separate the species from aggregates?
A: no

Q: can the organism abundances be explained metabolically?
A: the metallireducens (electron donor) organism is in lower abundance, not sure yet about metabolism.

ISME 13 Day 1

Tuesday, August 24th, 2010

Time to finally break the 9 month digital silence. I’ve had many posts in the works in my head, including the publication of progressiveMauve and its coverage in genomeweb, surprising bugs in progressiveMauve (to be continued), reviews of papers I’ve read, neat software engineering tricks I’ve learned, and various hypocrisies of science that seem to abound.

That will all have to wait, because today I’m posting my notes from the ISME 13 conference. ISME is an acronym for International Society for Microbial Ecology and the society meeting occurs every two years, this time in Seattle. I wasn’t planning to attend ISME, but Jenna Morgan has a poster here for work on which I’m a coauthor and she couldn’t make it so I’m presenting the work on her behalf. Last time I attended the meeting in Cairns I thoroughly enjoyed it, and yesterday’s talks and posters did not disappoint either.

Without further ado, here are some notes…
————————–
AmpliconNoise
————————–
I only caught about a third of this one.
Rare biosphere seems to be a sequencing artifact created by protocol noise and the 3% OTU cutoff
Open source software method to clean up 16S pyrotag data:

http://code.google.com/p/ampliconnoise/

YAY! open source
Substantially reduces inferred OTU counts, makes that pesky rare biosphere disappear.
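
A small Python sketch of why read noise plus a fixed 3% cutoff can inflate OTU counts. This is not the AmpliconNoise algorithm; it is a naive greedy clustering on equal-length toy reads using Hamming distance, and the sequences and error positions are invented for illustration.

```python
def hamming_fraction(a, b):
    """Fraction of mismatched positions between two equal-length reads."""
    assert len(a) == len(b)
    return sum(x != y for x, y in zip(a, b)) / len(a)


def greedy_otus(reads, cutoff=0.03):
    """Greedy clustering: a read joins the first centroid within the
    cutoff, otherwise it founds a new OTU. Returns the centroids."""
    centroids = []
    for read in reads:
        if not any(hamming_fraction(read, c) <= cutoff for c in centroids):
            centroids.append(read)
    return centroids


def mutate(seq, positions):
    """Return seq with the bases at the given positions swapped
    (A<->C, G<->T) to mimic sequencing errors."""
    swap = {"A": "C", "C": "A", "G": "T", "T": "G"}
    out = list(seq)
    for i in positions:
        out[i] = swap[out[i]]
    return "".join(out)


true_seq = "ACGT" * 25  # one real organism, 100 bp toy amplicon

clean_reads = [true_seq] * 10
noisy_reads = clean_reads + [
    mutate(true_seq, [3, 40]),            # 2% errors: stays in the OTU
    mutate(true_seq, [1, 20, 50, 77]),    # 4% errors: founds a new OTU
    mutate(true_seq, [5, 6, 7, 60, 90]),  # 5% errors: another "rare" OTU
]

n_clean = len(greedy_otus(clean_reads))
n_noisy = len(greedy_otus(noisy_reads))
```

With one real organism, the clean reads collapse to a single OTU, while the two badly damaged reads each show up as their own low-abundance OTU.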

———————————
Tal Dagan talk on Lateral Gene Transfer
———————————

60% of genes have lgt

analyzed about 650 genomes

directed network gains/losses
origins inferred using codon bias

non-homologous end joining proteins
200 genomes encode both subunits

nhej positive organisms appear to acquire from more distant organisms

———————————
widespread HR in Streptomyces
Daniel H Buckley
———————————

defines LGT to be allelic replacement

Fraser 2006 rates decline log-linearly with seq similarity based on MLSA analysis

some organisms very clonal, others recombining wildly at the tips of the tree

now let’s look at wild isolates

Streptomyces soil organism, wet dirt smells like Streptomyces B.O.

Half of antibiotics coming from Streptomyces

Sporulating — exospore. hyphae grows from spore, dna replicates during hyphae growth. hyphae form mycelium.

Streptomyces can do dsDNA exchange. depends on traB, atp dependent dsDNA translocator.

traB localizes at tips of hyphae, speculates traB might mobilize entire chromosomes during hyphae fusions

Someone designed MLST for Strepto

Six loci, not a single pair of congruent trees.

Used RDP to infer some recombination events. Signal looks a bit blurry.

ClonalFrame tree of Streptomyces trees

> 40% of genes impacted by HGT

HGT gene pairs have average 6% divergence — events are probably old.

did some sampling of S. flavogriseus pratensis from around eastern half of US.
samples are 99.8% identical on average
within “species” rho/theta = 27
cross species rho/theta < 1.
suggests cohesion

forward time simulation under a neutral model

estimates of divergence and rho/theta from Vos and Didelot approximately match the neutral speciation model predictions

Konstantinidis 2008 “valley of genetic discontinuity” — lack of reads recruiting at 86-92% identity

Fred Cohan asked a question or rather, made a statement about recombination driven cohesion and ecotypes.  It seemed like he was suggesting that ecotypes exist only in the niche-specifying genes, and the recombining core genes don’t give the data necessary to identify the ecotype.  So the ecotype model gets pushed further into the margins of the genome and we now have something Jeffrey Lawrence might call “ecotypes in pieces”  (for the unfamiliar, that’s a reference to Lawrence’s species in pieces work).

————————–
Broad host range plasmids LGT in bacteria
————————–

How frequently do bacteria swap DNA and plasmids with each other?

What is the host range of particular plasmids?
How do those plasmids evolve over time?

Use a model plasmid: IncP-1
simplified the plasmid by removing various transfer genes

Use Shewanella oneidensis MR-1 as a host

Ancestral plasmid unstable, but can naturalize into a new host over 1000 generations.

How did they adapt?
Can the adapted plasmids be moved to yet another new host? Not always.

many mutations localized in the replication protein TrfA1

evolved plasmid genes have copy-number amplification

conclusions: drug resistance plasmids can rapidly adapt to new hosts

single gene changes can facilitate host shifts

—————————————
GeneFish
—————————————

create a recombinant E. coli to capture environmental DNA

environmental DNA ends up on plasmid

Simplify metagenome protocol by getting environmental DNA directly into the cloning vector.

avoid DNA extraction bias, other biasing steps. (but isn’t cloning a bias??)

get env DNA into plasmid, select for it using colE3 relF toxin system, grow on plates.

selectable recombination seems to increase recomb frequency by 4-5 log units.

Now looking to test the system.
Use NarG gene. design homologous regions to narG gene for the plasmid
use 500bp homologous region
recombination frequency 10^-6
toxin selection effectiveness 25-50%?
tolerates 80-100% divergence from homologous region
not very good??

———————————————
Hinsby Cadillo, Sulfolobus sympatric species
disclaimer: I am a collaborator on this project
———————————————

S. islandicus genomes from the same location

39 strains from Mutnovsky formation, Kamchatka

MLSA some large clonal complexes, a few rare groups, some intermediate strains.

ClonalFrame infers presence of some recombination

12 genomes from M16 spring

2.285Mbp core genome, build a phylogeny based on core genome

ClonalOrigin analysis
higher than expected rates of exchange among closely related organisms, lower than expected exchange among more divergent organisms

geographically isolated populations do not recombine much at all

Recombination could be mediated by pili in S. solfataricus

respiratory nitrate reductase has a presence/absence pattern consistent with the phylogeny

gene content specified niche separation, within-niche cohesion by homologous recombination

biotorrents is born

Thursday, November 12th, 2009

I just posted my first biotorrent to biotorrents.net the other day.  What is biotorrents, you ask?  As the name suggests, it’s a BitTorrent tracker site for tracking biological datasets.  cool, huh?

Last year when I was working at a research institute in Australia, I found that whenever I needed to download a new version of the NCBI databases I would have to get them from NCBI’s ftp site in the USA.  That may work well enough for people located stateside, but the transpacific pipes do not seem to treat blast databases as priority electrons (photons, whatev).  Usually after a few days and several dropped TCP connections I would finally have the whole enchilada.  Not pleasant, but with a little persistence it was possible.

Enter biotorrents.  Now I can fire up my favorite bittorrent client and download the NCBI database from any of a number of globally distributed seeders, which should work much faster.  That’s the dream anyway.  In reality biotorrents is in early days and we need people to contribute bandwidth to the effort by seeding things like the NCBI databases.  Ideally, our great leaders at NCBI, EBI, and elsewhere would take the initiative and contribute by seeding their own databases.  David Lipman and Ewan Birney if you’re listening, consider this a public challenge!!

New Mauve release

Thursday, November 12th, 2009

I’m happy to say that a new Mauve release with many bugfixes has been made official today.  Developing and maintaining Mauve has been a challenge for me over the years, and each release requires a seemingly tremendous amount of effort.  Mauve has somewhere between hundreds and thousands of active users, each of whom seems to be running a different version of their favorite operating system and Java virtual machine.  Every time we need to do a new Mauve release, we have to struggle to ensure that we haven’t broken functionality on the myriad of software configurations.  Currently this is done using a slew of virtual machines running different operating systems, but the process is hardly automated.  Clearly one goal for the future of Mauve would be to do more automated quality testing of the software.  The automated testing might detect some types of problems really well, but others, such as the recent problem with OpenJDK drawing the display incorrectly, seem like they would be nearly impossible to detect programmatically.  Nonetheless, if automated software testing can help find a problem before an unsuspecting user trips over it, surely it’s worth the effort.  no?

The PhyloCoding has begun!

Saturday, June 13th, 2009

Every year for the past several years, Google has operated a charitable program called Google Summer of Code to support development of open source software.  Organizations that develop medium to large open-source projects apply to Google for support.  Accepted organizations create an “ideas list” of projects that would enhance their open source software.  Students from around the globe then apply for projects with the accepted organizations.  Successful students are paid by Google to work on the open-source project for the summer. Competition among organizations and students is stiff, with only 1000 of 5000+ students being accepted.

This year I’m co-mentoring a project with Marc Suchard that aims to develop a small and reusable open source software library to calculate phylogenetic likelihoods using CPUs and GPUs.  The project is part of this year’s Phyloinformatics Summer of Code which is being operated by the National Evolutionary Synthesis Center (NESCent) through Google’s program.  Lots of students applied for the project, perhaps because GPU computing is trendy and computer geeks tend to be some of the trendiest people around (even if not always socially graceful!).  Nonetheless there were many strong applicants and in the end, the successful student was Daniel Ayres, a Ph.D. student at UMD.

The project is now well underway, with all sorts of development activity by mentors, mentee, and other folks interested in the notion of a reusable library for phylogenetic likelihood models.
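
For readers wondering what a "phylogenetic likelihood" calculation involves, here is a minimal Python sketch of the core quantity: the likelihood of one alignment column on a two-leaf tree under the Jukes-Cantor model, computed by pushing partial likelihoods up the branches. It is not the project's code or API; every function name here is hypothetical, and a real library would handle full trees, other substitution models, and many sites at once.

```python
import math

STATES = "ACGT"


def jc69_prob(i, j, t):
    """Jukes-Cantor probability of state i becoming state j over a branch
    of length t (expected substitutions per site)."""
    e = math.exp(-4.0 * t / 3.0)
    return 0.25 + 0.75 * e if i == j else 0.25 - 0.25 * e


def leaf_partial(base):
    """Partial likelihood vector at a leaf: 1 for the observed base, else 0."""
    return [1.0 if s == base else 0.0 for s in STATES]


def prune(child_partial, t):
    """Push a child's partial likelihoods up a branch of length t."""
    return [
        sum(jc69_prob(x, y, t) * child_partial[k] for k, y in enumerate(STATES))
        for x in STATES
    ]


def site_likelihood(base_a, t_a, base_b, t_b):
    """Likelihood of one site for a root with two leaf children."""
    up_a = prune(leaf_partial(base_a), t_a)
    up_b = prune(leaf_partial(base_b), t_b)
    # Uniform base frequencies (0.25 each) at the root under Jukes-Cantor.
    return sum(0.25 * pa * pb for pa, pb in zip(up_a, up_b))


same = site_likelihood("A", 0.1, "A", 0.2)
diff = site_likelihood("A", 0.1, "G", 0.2)
```

The sums inside `prune` are independent across sites and across parent states, which is the kind of structure that maps well onto GPUs.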

Thanks to Google’s charitable arm for supporting so many students and projects!

Hypocrisy inside open access journals

Sunday, February 1st, 2009

Update 2: Peter Binfield writes in the comments below that PLoS One has begun accepting LaTeX.  Hooray!

Update: someone pointed me to the Topaz project, which looks promising!

I am currently preparing an article for submission to an open access journal (PLoS One, to be specific).  I have just learned that PLoS One, like many other journals, requires all articles to be submitted in either .doc or .rtf format. But why do I care?  My article was originally written in the open-source LaTeX system and intended as a conference contribution.  The article deals heavily in math and statistics and makes use of LaTeX’s excellent equation typesetting abilities.  As far as I can tell, it’s no simple matter to convert a LaTeX document with equations to M$ Word format.

How can it be that the leaders of the open-access journal movement require submissions in a closed and proprietary format?  Didn’t the open-access journal movement draw at least some of its inspiration from the free software movement that predated it by at least 10 years?  I presume the answer to this question lies at least partially with the proprietary nature of publishing and typesetting systems in common use at publishing houses.  The good people at PLoS probably made a decision to purchase existing proprietary publishing software for their operation rather than investing in an alternative that supports open standards.  And sadly, they now probably view change as too expensive.

To their credit, the topical PLoS journals do accept papers written with open-source software such as LaTeX, but that policy has only been in place recently.  The editorial office converts LaTeX submissions on a case-by-case basis.  Last year I published a paper authored in LaTeX in PLoS Genetics.  While I was very happy that I didn’t have to do the conversion myself, I think that the PLoS approach (and that of other journals) essentially amounts to applying band-aids to a broken publishing system.  It is not a good long term solution.

We need a scientific publishing system that is founded on open document standards and open source software.  Viable alternatives such as OpenOffice exist, yet I can not rely on OpenOffice to save complex equations in Microsoft Word documents (it works fine in the native OpenOffice format).  PLoS should lead the way in revolutionizing scientific publishing, and they should start on the inside by developing a publication process based on open standards.  After five years of PLoS, why are we still without a viable open-source platform for scientific publishing?

In the meantime, I have to carefully consider whether it’s a more effective use of my time to painstakingly convert my document to Word and support the status quo, or whether I should instead spend that time adding content that would make my article appropriate for a journal that will accept LaTeX.  Reformatting documents is mind-numbing, while submitting elsewhere might actually involve some interesting work.

MSI Wind U100: First impressions

Friday, November 7th, 2008

For the past three years I’ve been carrying around a Dell X1 laptop (originally designed by Samsung as the Q30).  In the past months it’s begun to show signs of old age: the batteries no longer hold much charge, I’ve maxed the HDD and don’t want to offload more data, the screen is fading, and there’s a large divot in the trackpad’s left mouse button where my thumbnail hits. hehehe.

It was time to get a new laptop and I decided to find out why people are making so much noise about netbooks.  After poking around a few reviews, I narrowed down my candidates to the 10″ Asus eeePC, the MSI Wind U100, and the Acer Aspire One.  I finally ended with the MSI Wind based on its 160gb hard drive, the possibility to buy a 9 cell battery which will last 6+ hours, claims of solid build and a respectable keyboard, and supposedly little heat and little noise.  The major downside of the Wind is that it only ships with Windoze, so yet again I was stuck paying MS tax.

The laptop arrived in the mail this week, and my first impression is that the reviews were generally spot on, except when it comes to noise.  The MSI Wind has a fan and a 2.5 inch spinning-hunk-o-metal hard drive inside, both of which can make a ruckus if you’re sitting in a seminar room.  To put this in perspective, I’ve been using a Dell X1 for the past three years which has NO fan and uses a nearly silent 1.8″ hard drive.  Of course the problem with the X1’s lack of a fan is that it can get quite toasty even when doing basic computing like web browsing.  Why oh why did Transmeta have to die?

A new life, a new laptop

Friday, November 7th, 2008

I’ve just moved to Davis, California, where I’m working in the laboratory of Jonathan Eisen.  It’s a continuation of the very generous 3-year postdoctoral fellowship awarded to me by the National Science Foundation.  I spent the first two years at an unspecified research institute in Brisbane, Australia.

I was hoping my first post would be a rant about the scientific tradition of manuscript review by secret committee, complete with a personal example, but that will have to wait.  In the meantime I’ve just got a new laptop, and I’m having so much fun (err, problems?) configuring it that I feel I should share it with the world.