Lab 4
Lab 4
Overview
In this lab you will come up with an algorithm to traverse a tree for desired results. There will be no homework or coding associated with this lab.
Phylogeny of Snakes In February of 2024, A giant anaconda species thought to be the largest in the world has been captured deep in the Amazon of Ecuador by a team of scientists from The University of Queensland. The group of scientists, led by professor Bryan Fry, uncovered the nearly 10 million-year-old species with help from the indigenous Huaorani people while filming “Pole to Pole with Will Smith,” a National Geographic series streaming on Disney+ and hosted by the Oscar winner. You can read more about the story here
Scientists use phylogenetic trees to show the evolutionary pathways and connections among organisms. Phylogeny is the representation of the evolutionary history and relationships between groups of organisms. These relationships are determined by phylogenetic inference methods that focus on observed heritable traits, such as DNA sequences, protein amino acid sequences, or morphology. A phylogenetic diagram can be rooted or unrooted. A rooted tree diagram indicates the hypothetical common ancestor of the tree. Below is the illustation of the rooted phylogenetic tree of snakes taken from Lee et al. (2007).
Last Common Ancestor
Given a rooted phylogenetic tree, you are going to help our scientists find the last common ancestor of two given snakes species. For example, the last common ancestor of Cylindrophis and Anomochils is Uropeltidae species. More formally, last common ancestory of nodes u
and v
on a Tree is defined as the lowest (i.e. deepest) node that has both u
and v
as descendants, where we define each node to be a descendant of itself (so if u
has a direct connection from v
, v
is the lowest common ancestor).
Input
- A Tree representing the snake phylogeny. Each point of branching represents a species. We assume that there are no more than three branchings at each point. Hence, this is a ternary tree, as opposed to a binary tree.
- Two strings,
u
andv
. The two strings represent two distint snake species.
Output
- String
w
representing the name of the last common ancestor of the two snakes,u
andv
on the phylogenic tree.
Step 0: Creating a markdown file for writeups
-
One person in your team should visit https://demo.hedgedoc.org/new to create an empty document
-
Share the URL of the document created with the others in your team (e.g., email them the link in the address bar). When someone else visits the URL, they are able to collaboratively edit the document.
-
Each person should copy and paste the URL of the document into the Moodle submission box for the “Lab 4” assignment. Do this right away, even before writing in the document, so that your course account has a link to your work.
Step 1: Represent the snake phylogeny
At an abstract level, describe how you will represent the phylogeny tree. In your answer include: What are the nodes of the tree? How are the nodes connected to each other? What is the root of the tree? What are the leaves?
Step 2: Design an algorithm for searching for a desired node
Devise a recursive function that discovers a path from the root to the desired node. In your write up include: Clear concise pseudocode Justification of correctness Analysis of time and space complexity
Step 3: Design an algorihtm for the lowest common ancestor
Write a function to find the last common ancestor within a tree for two strings u and v. Hint: think about how you could use the function from the previous step. Once you’ve worked on these parts, check in with the instructor/ TA to make sure that the algorithm is as efficient as it can be.
Step 4: Analyze time and space complexity of your algorithm
State clearly the worst-case time and space complexities for a phylogenentic tree with a total of $n$ snake names, for the following two scenarios. Please show work and only approximate for asymptotic behavior at the very last step.
- Having a completely balanced phylogenetic tree. i.e., each node had all three children present and the tree is as shallow as it can be.
- Having a completely inbalanced phylogenetic tree. i.e., each node has one or two children only and the tree is as deep as it can be.
Step 5: Submit the writeup
- Each group member should submit the same markdown file to Moodle. In the submission comment add a link to the hedgedoc file and also indicate who you worked with.