
TIE2030 Programming Methodology with Python 

Take-Home Assignment 

 

Release Date: 12 November, 2021 (~2pm) 

 

Deadline: 15 November, 2021, 11pm

Late Submission Deadline (25% Penalty): 17 November, 2021, 11pm

No submissions will be accepted after 17 November, 2021, 11pm.

 

 

Details on what you need to upload can be found on Page 9.

Problem Statement: In this assignment, you will be searching for a given
list of motifs (small, fixed-size patterns) in a given list of DNA sequences.
A DNA sequence is made up of four nucleotides: A (adenine), G (guanine),
C (cytosine), and T (thymine). Motif finding is an important problem in the
Bioinformatics domain. It helps to identify common features between species,
contributes to the understanding of human diseases, and helps drug
manufacturers to develop targeted drugs.

 

Input and Output Data

The following are given to you in this assignment.

• List of DNA sequences: To be read from the sequences.txt file. Each line
of the file contains one DNA sequence.

• List of Motifs: To be read from the motifs.txt file. Each line of the file
contains one motif.

Write a Python program to perform the following analysis. Your outputs must
be written with clear messages to the DNA_analysis_results.txt file.

 

What do you need to do in this take-home assignment? 

You need to write a program that searches for ALL the motifs given to you
in each DNA sequence, generates the output, and stores the counts in a
dictionary. You must also compare each of the given sequences to a target
sequence and report the statistics. You may use any built-in library
function, if needed.

 

ANSWER ALL THE FOLLOWING: 

 

Read through all the questions and the skeleton code on Page 7 before you start
coding. Use the skeleton code. All of the functions specified in Questions (2),
(3), (4), (5), (7) must be called from your main() function [Note: No parameters
are passed into the main() function].

(1) In your main() function, read the motifs from the file motifs.txt and store
them in a list. Create a dictionary Motif_Count_Dictionary whose keys are the
motifs you have read from the file motifs.txt and whose values are
initialized to zero. Read the DNA sequences from the file sequences.txt.
Write each DNA sequence and its length with a clear, meaningful message
to your output file DNA_analysis_results.txt. See the sample output on
Page 8.
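The reading and initialisation steps in Question (1) could be sketched roughly as follows. This is only an illustration, not the skeleton code: the helper names clean_lines, read_lines, init_motif_counts and length_message are hypothetical, and the wording of the output message is an assumption because the sample output is not reproduced here.

```python
def clean_lines(lines):
    """Strip whitespace from each line and drop lines that are empty."""
    return [line.strip() for line in lines if line.strip()]


def read_lines(path):
    """Read a text file and return its non-empty lines (one item per line)."""
    with open(path, "r", encoding="utf-8") as handle:
        return clean_lines(handle)


def init_motif_counts(motifs):
    """Build the motif dictionary with every motif mapped to zero."""
    return {motif: 0 for motif in motifs}


def length_message(number, sequence):
    """Format one output line for a sequence (message wording is assumed)."""
    return f"DNA sequence {number}: {sequence} (length = {len(sequence)})"


# Inside main(), assuming both input files sit next to the script:
#     motifs = read_lines("motifs.txt")
#     sequences = read_lines("sequences.txt")
#     Motif_Count_Dictionary = init_motif_counts(motifs)
```

Keeping the line cleaning separate from the file access makes the parsing easy to check with a plain list of strings before any file is involved.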

 

 

(2) Write a Python function Nucleo_Counter(…) and pass it each DNA sequence,
along with any other parameters the function needs. Call this function
from the main() function. The function must count the number of
occurrences (frequencies) of each nucleotide A, G, C, T in the DNA
sequence you pass in. Write your counted values with a clear, meaningful
message to your output file DNA_analysis_results.txt. See the sample
output on Page 8.
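One possible shape for the nucleotide count in Question (2) is sketched below. It assumes the function returns the counts and leaves the file writing to the caller, which is a design choice of this sketch and not something the assignment prescribes. The nucleotides parameter and the nucleotide_message helper are hypothetical.

```python
def Nucleo_Counter(sequence, nucleotides="AGCT"):
    """Return a dictionary mapping each nucleotide to its count in sequence."""
    return {base: sequence.count(base) for base in nucleotides}


def nucleotide_message(counts):
    """Format the counts as one readable line (message wording is assumed)."""
    parts = [f"{base} = {count}" for base, count in counts.items()]
    return "Nucleotide counts: " + ", ".join(parts)


counts = Nucleo_Counter("ATGTAAAGCCTATAGTGGGGC")
print(nucleotide_message(counts))
```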

 

(3) Write a Python function Motif_Counter(…) and pass it each DNA sequence,
your Motif_Count_Dictionary, and any other parameters the function needs.
Call this function from the main() function. The function must
count the number of occurrences (frequencies) of each motif that you
have read from the file motifs.txt and accumulate the counts in the
corresponding entries of your Motif_Count_Dictionary. For example, the
number of occurrences of motif TC must be added to the entry with key TC
in your Motif_Count_Dictionary.
Write your counted values with a clear, meaningful message to your output
file DNA_analysis_results.txt. See the sample output.
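The accumulation described in Question (3) might look like the sketch below. The assignment does not say whether overlapping occurrences count separately, so the overlapping flag and the count_occurrences helper are assumptions introduced here; with overlapping=False the result matches what str.count would report.

```python
def count_occurrences(sequence, motif, overlapping=True):
    """Count how many times motif appears in sequence."""
    if not motif:
        return 0
    # Advance by one character to allow overlaps, or by the motif length
    # to skip past each match entirely.
    step = 1 if overlapping else len(motif)
    count = 0
    position = sequence.find(motif)
    while position != -1:
        count += 1
        position = sequence.find(motif, position + step)
    return count


def Motif_Counter(sequence, Motif_Count_Dictionary, overlapping=True):
    """Add this sequence's motif counts to the running totals.

    Returns the per-sequence counts so the caller can write them out.
    """
    per_sequence = {}
    for motif in Motif_Count_Dictionary:
        hits = count_occurrences(sequence, motif, overlapping)
        per_sequence[motif] = hits
        Motif_Count_Dictionary[motif] += hits
    return per_sequence
```

The dictionary is updated in place, so the same object can be passed in for every sequence and will hold the totals at the end.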

(4) Write a Python function Freq_Counter(…) to determine which motif
occurs most frequently (maximum frequency) and which motif occurs least
frequently (minimum frequency) in the given DNA sequences. Pass
your Motif_Count_Dictionary and any other parameters the function needs.
Call this function from the main() function. Write your results
(the corresponding motifs and their frequencies) with a clear, meaningful
message to your output file DNA_analysis_results.txt. See the sample
output.

Important Note: If more than one motif occurs most frequently or
least frequently, write all of them to your output file.
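A minimal sketch of the tie handling required in Question (4) follows. It assumes the function returns its findings instead of writing them, and the shape of the return value is a choice made for this illustration.

```python
def Freq_Counter(Motif_Count_Dictionary):
    """Return ((most_frequent_motifs, max_count), (least_frequent_motifs, min_count)).

    Every motif tied for the maximum or minimum is included.
    """
    if not Motif_Count_Dictionary:
        return ([], 0), ([], 0)
    highest = max(Motif_Count_Dictionary.values())
    lowest = min(Motif_Count_Dictionary.values())
    most = [m for m, c in Motif_Count_Dictionary.items() if c == highest]
    least = [m for m, c in Motif_Count_Dictionary.items() if c == lowest]
    return (most, highest), (least, lowest)


(most, highest), (least, lowest) = Freq_Counter({"TC": 5, "AT": 2, "GG": 5, "CA": 2})
print(f"Most frequent: {most} with {highest} occurrences")
print(f"Least frequent: {least} with {lowest} occurrences")
```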

(5) Define a target sequence Target_Seq as follows (refer to the skeleton
code on Page 7):

Target_Seq = 'ATGGGGAATGCGCAATGCAACGTAATTTAGAGGAGCCCCAGTTTGAAAGT'

Write a Python function Target_Search(…) to compare each of the given
DNA sequences against the target sequence Target_Seq. Pass it
each DNA sequence and any other parameters the function needs.
Call this function from your main() function. The function must perform the
following:

Count the number of elements that match exactly at the same positions
in the DNA sequence you passed in and in Target_Seq. This
gives the "similarity" between that DNA sequence and the target sequence
Target_Seq. Return this value from your Target_Search(…) function to
your main() function.

For example, given a target sequence:

ATGTAAAGCCTATAGTGGGGC

and a DNA sequence, say:

ATGTTTTGCCTATAGTATGGCATAGTAGTA

the similarity score between the two example sequences is 16.

After finding all the similarities, in your main() function, find the
sequences that are most similar and the sequences that are least similar
to Target_Seq. Print your results with clear, meaningful messages to
your output file DNA_analysis_results.txt. Refer to the sample output.

 

Important Note: If more than one sequence is most or least
similar to the target sequence, write all of them to your output file.
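The position-by-position comparison in Question (5) can be sketched as below. It assumes that positions beyond the end of the shorter sequence are simply ignored, which agrees with the worked example (21 characters against 30, score 16). The target parameter and the most_and_least_similar helper are hypothetical additions for this illustration.

```python
Target_Seq = 'ATGGGGAATGCGCAATGCAACGTAATTTAGAGGAGCCCCAGTTTGAAAGT'


def Target_Search(sequence, target=Target_Seq):
    """Count positions where sequence and target hold the same nucleotide."""
    # zip stops at the shorter string, so trailing characters are not compared.
    return sum(1 for a, b in zip(sequence, target) if a == b)


def most_and_least_similar(scored):
    """Given (sequence, score) pairs, return the tied best and worst groups."""
    highest = max(score for _, score in scored)
    lowest = min(score for _, score in scored)
    most = [seq for seq, score in scored if score == highest]
    least = [seq for seq, score in scored if score == lowest]
    return (most, highest), (least, lowest)


# The worked example from the assignment text:
print(Target_Search("ATGTTTTGCCTATAGTATGGCATAGTAGTA", "ATGTAAAGCCTATAGTGGGGC"))  # 16
```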

 

(6) 

In your main() function, measure the time taken to run your analysis 

as required from Question 1 to Question 5 (the time to run from the start 

of your program to the end of your code for Question 5). Write the time 

you measured with a clear meaningful message to your output file 

DNA_analysis_results.txt. See the sample output. 
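One way to take the measurement in Question (6) is with time.perf_counter from the standard library, as in the sketch below. The run_timed wrapper and the message wording are assumptions; inside main() the same two perf_counter calls could just as well be placed directly around the code for Questions 1 to 5.

```python
import time


def run_timed(analysis):
    """Run a zero-argument callable and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = analysis()
    elapsed = time.perf_counter() - start
    return result, elapsed


def time_message(elapsed):
    """Format the elapsed time for the output file (wording is assumed)."""
    return f"Time taken for Questions 1 to 5: {elapsed:.6f} seconds"


result, elapsed = run_timed(lambda: sum(range(1000)))
print(time_message(elapsed))
```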

(7) 

Write a Python function Plot_Chart(…) to plot a bar chart that shows 

the total number of occurrences of each motif in all of the DNA 

sequences given in sequences.txt (the counts that you have accumulated 

for each motif in your Motif_Count_Dictionary). Clearly present your 

chart with all the required information, title, and axis labels, as shown in 

the sample output. Pass your Motif_Count_Dictionary and other 

parameters needed for this function. Call this function from the main() 

function. 
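A bar chart for Question (7) might be produced along the following lines. This sketch assumes the third-party matplotlib package is installed, and it saves the chart to a file instead of showing it in a window. The title, axis labels, image_path parameter and chart_data helper are all assumptions, since the sample chart is not reproduced here.

```python
def chart_data(Motif_Count_Dictionary):
    """Split the dictionary into parallel lists of motifs and counts."""
    motifs = list(Motif_Count_Dictionary)
    counts = [Motif_Count_Dictionary[motif] for motif in motifs]
    return motifs, counts


def Plot_Chart(Motif_Count_Dictionary, image_path="motif_counts.png"):
    """Draw a bar chart of total motif occurrences and save it as an image."""
    # matplotlib is imported here so the rest of the program still runs
    # on a machine where it is not installed.
    import matplotlib.pyplot as plt

    motifs, counts = chart_data(Motif_Count_Dictionary)
    fig, ax = plt.subplots()
    ax.bar(motifs, counts)
    ax.set_title("Total occurrences of each motif in all DNA sequences")
    ax.set_xlabel("Motif")
    ax.set_ylabel("Number of occurrences")
    fig.tight_layout()
    fig.savefig(image_path)
    plt.close(fig)
    return image_path
```

Calling plt.show() in place of fig.savefig(...) would display the chart interactively, if that is what the sample output calls for.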


IMPORTANT FEATURES: 

• Displaying your results with meaningful messages and clarity, writing
meaningful comments (in addition to the comments given in the
skeleton), and using meaningful variable names also carry marks.
Refer to the rubrics.

• In your output file, you need to print the length, counts of nucleotides,
counts of motifs, and similarity to the target sequence for each DNA
sequence before proceeding to the next DNA sequence. Refer to the
sample output.

• Your code should be able to give the correct results for different DNA
sequences and motifs (which also consist of A, T, G, C nucleotides),
without changing the code. That is, if we change some motifs and
sequences in motifs.txt and sequences.txt, and run your code, we expect
the correct results for the new input data in your output file
DNA_analysis_results.txt.

• For file writing, you can either write to the output file while processing or
store your output messages in a list of strings and write to the output file
at the end of the program.

 

SKELETON CODE: 

 

Use the skeleton code below. 

Note: you must NOT declare any variables outside the functions other than
Target_Seq. You are allowed to write additional functions if needed.

SAMPLE OUTPUT: 

Important note: The results shown in the sample output below are obtained 

using DIFFERENT data from the files given to you. Therefore, the results shown 

below DO NOT match the results you obtain using the data in sequences.txt and 

motifs.txt. You should compute and check your results by yourself using the data 

in the two given files.

 

 

 

RUBRICS:

Part / Description / Marks

• For reading data from files; initialising the Motif_Count_Dictionary;
printing the DNA sequences and their lengths to the output file.

• For the logic and execution of the Nucleo_Counter(…) function.

• For the logic and execution of the Motif_Counter(…) function.

• For the logic and execution of the Freq_Counter(…) function.

• For the logic and execution of the Target_Search(…) function; for
finding the DNA sequences that are most and least similar to the target
sequence.

• For measuring the processing time.

• For the logic and execution of the Plot_Chart(…) function; displaying
the chart clearly with all the required information.

• Coding quality, output display: For writing clear code and comments;
using meaningful variable names; printing outputs with clear messages;
following the skeleton code and other requirements in the questions and
template; use of functions and parameter passing.

• Total

 

WHAT DO YOU NEED TO UPLOAD?

Upload a .zip file to the StudentSubmission_TakeHomeAssng_TIE2030 folder
with name: _FINAL_ASSIGNMENT_< FIRST_NAME>.zip
containing the following files:

(1) Your working code CODE__FINAL_ASSIGNMENT_.py

(2) Your report (convert to PDF after you complete it)
REPORT__FINAL_ASSIGNMENT_.pdf

(3) Your output file DNA_analysis_results.txt.

Refer to the template. Please follow the file naming strictly.
