Lab 4

Lab 4

Overview

In this lab you will come up with an algorithm to traverse a tree for desired results. There will be no homework or coding associated with this lab.

Phylogeny of Snakes In February of 2024, A giant anaconda species thought to be the largest in the world has been captured deep in the Amazon of Ecuador by a team of scientists from The University of Queensland. The group of scientists, led by professor Bryan Fry, uncovered the nearly 10 million-year-old species with help from the indigenous Huaorani people while filming “Pole to Pole with Will Smith,” a National Geographic series streaming on Disney+ and hosted by the Oscar winner. You can read more about the story here

Scientists use phylogenetic trees to show the evolutionary pathways and connections among organisms. Phylogeny is the representation of the evolutionary history and relationships between groups of organisms. These relationships are determined by phylogenetic inference methods that focus on observed heritable traits, such as DNA sequences, protein amino acid sequences, or morphology. A phylogenetic diagram can be rooted or unrooted. A rooted tree diagram indicates the hypothetical common ancestor of the tree. Below is the illustation of the rooted phylogenetic tree of snakes taken from Lee et al. (2007).

Last Common Ancestor Given a rooted phylogenetic tree, you are going to help our scientists find the last common ancestor of two given snakes species. For example, the last common ancestor of Cylindrophis and Anomochils is Uropeltidae species. More formally, last common ancestory of nodes u and v on a Tree is defined as the lowest (i.e. deepest) node that has both u and v as descendants, where we define each node to be a descendant of itself (so if u has a direct connection from v, v is the lowest common ancestor).

Input

Output

Step 0: Creating a markdown file for writeups

Step 1: Represent the snake phylogeny

At an abstract level, describe how you will represent the phylogeny tree. In your answer include: What are the nodes of the tree? How are the nodes connected to each other? What is the root of the tree? What are the leaves?

Step 2: Design an algorithm for searching for a desired node

Devise a recursive function that discovers a path from the root to the desired node. In your write up include: Clear concise pseudocode Justification of correctness Analysis of time and space complexity

Step 3: Design an algorihtm for the lowest common ancestor

Write a function to find the last common ancestor within a tree for two strings u and v. Hint: think about how you could use the function from the previous step. Once you’ve worked on these parts, check in with the instructor/ TA to make sure that the algorithm is as efficient as it can be.

Step 4: Analyze time and space complexity of your algorithm

State clearly the worst-case time and space complexities for a phylogenentic tree with a total of $n$ snake names, for the following two scenarios. Please show work and only approximate for asymptotic behavior at the very last step.

  1. Having a completely balanced phylogenetic tree. i.e., each node had all three children present and the tree is as shallow as it can be.
  2. Having a completely inbalanced phylogenetic tree. i.e., each node has one or two children only and the tree is as deep as it can be.

Step 5: Submit the writeup