So this is what I think I know, coming into bioinformatics from an engineering/layperson's perspective, based on what little I remember from high school and freshman-year biology classes:
- There are publicly accessible databases that contain the entire genomes of many kinds of organisms, including humans, mice, tobacco, corn, yeast, E. coli, and many others.
- The genome for any cell is just a string of letters of DNA (nucleotides) in a particular sequence.
- The genome may be divided into pieces within the cell. These pieces are called chromosomes.
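As a rough illustration of the three points above, here is a minimal Python sketch that treats a genome as plain strings, one per chromosome. The sequences and the names `toy_genome`, `chrA`, `chrB`, `genome_size`, and `is_valid_dna` are made up for illustration; real genomes are far longer, and this is only my mental model, not how any database actually stores them.

```python
# Toy model: a genome as a mapping from chromosome name to a DNA string.
# The sequences below are invented placeholders, not real data.
toy_genome = {
    "chrA": "ATGCGTACGTTAGC",
    "chrB": "GGCATTA",
}

def genome_size(genome):
    """Total number of nucleotides across all chromosomes."""
    return sum(len(seq) for seq in genome.values())

def is_valid_dna(seq):
    """True if the string uses only the four DNA letters A, C, G, T."""
    return set(seq) <= set("ACGT")

for name, seq in toy_genome.items():
    print(name, len(seq), is_valid_dna(seq))
print("total:", genome_size(toy_genome))
```

If this picture is roughly right, "how big is the genome?" reduces to adding up string lengths.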
February 21, 2022:
To make my learning as hands-on as possible, I want to work off a list of concrete, specific questions for a particular organism, maybe an organism I can culture at BosLab. Based on the three points above, some good warmup questions for me to explore are:
- How can I access the genome for E. coli?
- Find some general info about the E. coli genome:
  - Is it linear or circular?
  - How big is it?
  - Is it divided into pieces/chromosomes? How many?
- How do we navigate the genome?
  - How do we specify particular locations in the genome? Is there a standard coordinate system, like numbering the nucleotides?
  - How is the genome divided into sections? For example, human writings are usually divided into volumes, chapters, paragraphs, sentences, and words. Is the genome divided or organized in any way?
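To make the coordinate question concrete, here is a toy Python sketch of what "numbering the nucleotides" could look like. It assumes positions are counted from 1 and that a range includes both endpoints; whether the E. coli resources actually use that convention is exactly what I need to check, so treat it as an assumption. `fetch_region` is a hypothetical helper and the sequence is invented.

```python
def fetch_region(seq, start, end):
    """Return nucleotides start..end, numbered from 1, both ends included."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("coordinates outside the sequence")
    # Python slices count from 0 and exclude the end, hence start - 1.
    return seq[start - 1:end]

toy_sequence = "ATGCGTACGTTAGC"  # invented placeholder, not real data
print(fetch_region(toy_sequence, 1, 3))
print(fetch_region(toy_sequence, 4, 6))
```

The off-by-one adjustment in the slice is the kind of detail that makes it worth finding out which numbering convention a given database or file format follows.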
We will work on these tomorrow.