Develop one question yourself that can be answered with the information included in this dataset. Write the code to answer the question, and include a visualization.

Python Problems

2 parts

Overview

This assignment will allow you to practice algorithmic thinking and basic Python programming with several small-scale problems. As you solve each problem, follow the steps of algorithmic thinking as outlined below. NOTE: you only need to provide an algorithm, flowchart and test cases for part 2 (no algorithm/flowchart/test cases are needed for part 1).

Step 1: Algorithm Description. Write a step-by-step algorithm and draw a flowchart that together express how to accomplish the given task. Remember, you have to be explicit and clear enough that someone else could accomplish the task by following your directions. Describe the input(s), output(s) and the process of the algorithm.

Step 2: Program Code – Implementation: Implement the algorithm in Python using the basic structures we covered in class (ONLY USE CONCEPTS COVERED IN CLASS):

User input

Variables

Operators

Conditional execution

For/while loops

Data structures

Functions and modules

Pandas

Step 3: Program Testing: Create a Test Plan with two or three test cases that demonstrate your code works as intended. Explain how you used these test cases in your comments.

Step 4: Program Documentation: Be sure to comment thoroughly so that it is clear that you understand what every line of the code is intended to accomplish.

Part 1: Data Analysis and Visualization

You will work with a dataset that contains information on a coffee shop’s sales. The dataset is below. DOWNLOAD THE DATASET AS A CSV FILE ON YOUR COMPUTER FROM THE LINK BELOW AND READ IT IN PANDAS FROM THERE. DO NOT READ IT FROM THE LINK BELOW.

Dataset: https://drive.google.com/file/d/141afTVoF0J2FjpLI-VfERyJM7aWUQ8az/view?usp=sharing

Variables:

transaction_id – transaction id

transaction_date – transaction date

transaction_time – transaction time

sales_outlet_id – sales outlet (A, B, C, D, E, F or G)

staff_id – id of the staff member

customer_id – ID of the customer

instore_yn – whether the sale was in the store (yes or no)

product_id – id of the product

quantity – quantity purchased

unit_price – price per unit (item) in USD

promo_item_yn – whether the item was on promotion (yes or no)

Question 1.

Import the CSV file in pandas and save it as a dataframe. Then, write code that returns: (1) the first 10 and last 10 rows; and (2) the number of rows and columns in the data set. Discuss what the code shows you about the data set.
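As a rough illustration of the calls involved, the sketch below reads a tiny made-up CSV (three rows, invented values) in place of the downloaded file. With the real data you would pass the local path to pd.read_csv instead; the file name shown in the comment is an assumption.

```python
import io
import pandas as pd

# Toy stand-in for the downloaded file; the values are invented for illustration.
# With the real data, pass the local path instead, for example
# pd.read_csv("coffee_sales.csv") (the file name is an assumption).
toy_csv = io.StringIO(
    "transaction_id,sales_outlet_id,quantity,unit_price\n"
    "1,A,2,3.50\n"
    "2,B,1,2.00\n"
    "3,A,3,4.25\n"
)

df = pd.read_csv(toy_csv)

first_rows = df.head(10)    # first 10 rows (fewer if the frame is shorter)
last_rows = df.tail(10)     # last 10 rows
n_rows, n_cols = df.shape   # shape is a (rows, columns) tuple

print(first_rows)
print(last_rows)
print(f"{n_rows} rows, {n_cols} columns")
```

df.info() and df.dtypes are also handy at this point for discussing which columns were read as numbers and which as text.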

Question 2.

Write code that returns: (1) the distribution of sales outlets (including a count of each outlet type and a bar chart); (2) the minimum and maximum transaction_id; (3) the minimum, maximum and average customer_id; and (4) the distribution of products bought in store (yes or no) using a pie chart.
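One possible shape for these summaries is sketched below on a five-row invented frame. The "Y"/"N" coding of instore_yn is an assumption (check how the real file encodes it), and plot_distributions is a hypothetical helper name. matplotlib is imported inside the helper only so that the counting part runs without it.

```python
import pandas as pd

# Invented rows for illustration only; "Y"/"N" coding is an assumption.
df = pd.DataFrame({
    "transaction_id": [101, 102, 103, 104, 105],
    "sales_outlet_id": ["A", "B", "A", "C", "A"],
    "customer_id": [10, 20, 30, 40, 50],
    "instore_yn": ["Y", "N", "Y", "Y", "N"],
})

# (1) how many transactions each outlet has
outlet_counts = df["sales_outlet_id"].value_counts()

# (2) smallest and largest transaction id
id_min = df["transaction_id"].min()
id_max = df["transaction_id"].max()

# (3) min, max and mean of customer_id in one call
customer_stats = df["customer_id"].agg(["min", "max", "mean"])

# (4) in-store versus not in-store
instore_counts = df["instore_yn"].value_counts()


def plot_distributions(outlet_counts, instore_counts):
    """Draw the bar chart and the pie chart side by side (hypothetical helper)."""
    import matplotlib.pyplot as plt  # needs matplotlib installed

    fig, (ax_bar, ax_pie) = plt.subplots(1, 2, figsize=(10, 4))
    outlet_counts.plot(kind="bar", ax=ax_bar, title="Transactions per outlet")
    instore_counts.plot(kind="pie", ax=ax_pie, autopct="%1.0f%%", title="In store?")
    ax_pie.set_ylabel("")  # hide the default axis label on the pie
    plt.show()
```

In a notebook, calling plot_distributions(outlet_counts, instore_counts) displays both charts. Note that the average of an ID column has no real-world meaning; it is computed here only because the question asks for it.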

Question 3.

You discover that the variable unit_price was incorrectly recorded. Create a new variable unit_price_corrected where you add 1.50 to unit_price for the first 100 items, and you subtract 1.50 from the unit price for the remaining items in the data set. Then, calculate and compare the average of unit_price and unit_price_corrected.
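A minimal sketch of the correction follows, assuming "the first 100 items" means the first 100 rows in file order. The frame is invented and only five rows long, so the hypothetical constant CUTOFF is set to 2 here; it would be 100 for the assignment.

```python
import pandas as pd

# Invented prices; the real data set has many more rows.
df = pd.DataFrame({"unit_price": [3.00, 4.00, 5.00, 6.00, 7.00]})

CUTOFF = 2  # would be 100 in the assignment; 2 keeps the toy example readable

# +1.50 for rows before the cutoff (by position), -1.50 for all later rows
adjustment = [1.50 if i < CUTOFF else -1.50 for i in range(len(df))]
df["unit_price_corrected"] = df["unit_price"] + adjustment

mean_original = df["unit_price"].mean()
mean_corrected = df["unit_price_corrected"].mean()

print(df)
print(f"average unit_price:           {mean_original:.2f}")
print(f"average unit_price_corrected: {mean_corrected:.2f}")
```

Building the adjustment by row position rather than by index label keeps the logic correct even if the frame has been filtered or re-indexed earlier.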

Question 4.

The coffee shop’s management wants to find out which of the outlets has the highest revenue. Calculate the total revenue for each of the outlets. Remember that total revenue will be unit_price_corrected multiplied by quantity. Also, present your calculations using a line graph. Explain what you found and what the chart shows.
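The revenue calculation can be sketched as a new column followed by a group-wise sum. The rows below are invented, the revenue column name is an assumption, and plot_revenue is a hypothetical helper for the line graph.

```python
import pandas as pd

# Invented rows for illustration only.
df = pd.DataFrame({
    "sales_outlet_id": ["A", "B", "A", "C", "B"],
    "quantity": [2, 1, 3, 4, 2],
    "unit_price_corrected": [4.50, 5.00, 2.00, 1.50, 3.00],
})

# Revenue of each transaction line, then the total per outlet
df["revenue"] = df["unit_price_corrected"] * df["quantity"]
revenue_by_outlet = df.groupby("sales_outlet_id")["revenue"].sum()

# Label of the outlet with the largest total
top_outlet = revenue_by_outlet.idxmax()

print(revenue_by_outlet)
print("Highest revenue:", top_outlet)


def plot_revenue(revenue_by_outlet):
    """Line graph of total revenue per outlet (hypothetical helper)."""
    import matplotlib.pyplot as plt  # needs matplotlib installed

    ax = revenue_by_outlet.plot(kind="line", marker="o", title="Total revenue by outlet")
    ax.set_xlabel("Sales outlet")
    ax.set_ylabel("Revenue (USD)")
    plt.show()
```

Calling plot_revenue(revenue_by_outlet) in a notebook draws the chart; sorting with revenue_by_outlet.sort_values() first can make the ranking easier to read.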

Question 5.

The coffee shop’s management wants to find out how the staff are doing in terms of sales. For each of the staff ids, calculate the total product units sold and the total revenue generated. Provide two bar charts (one for total product units, one for total revenue) by staff id, and interpret your findings.
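Both staff totals can come from a single grouped aggregation, as in the sketch below. The data are invented, the output column names total_units and total_revenue are assumptions, and plot_staff is a hypothetical helper.

```python
import pandas as pd

# Invented rows for illustration only.
df = pd.DataFrame({
    "staff_id": [1, 2, 1, 3, 2],
    "quantity": [2, 1, 3, 4, 2],
    "unit_price_corrected": [4.50, 5.00, 2.00, 1.50, 3.00],
})
df["revenue"] = df["unit_price_corrected"] * df["quantity"]

# One row per staff member, with units and revenue summed
staff_summary = df.groupby("staff_id").agg(
    total_units=("quantity", "sum"),
    total_revenue=("revenue", "sum"),
)
print(staff_summary)


def plot_staff(staff_summary):
    """Two bar charts: units sold and revenue per staff id (hypothetical helper)."""
    import matplotlib.pyplot as plt  # needs matplotlib installed

    fig, (ax_units, ax_rev) = plt.subplots(1, 2, figsize=(10, 4))
    staff_summary["total_units"].plot(kind="bar", ax=ax_units, title="Units sold by staff id")
    staff_summary["total_revenue"].plot(kind="bar", ax=ax_rev, title="Revenue by staff id")
    ax_rev.set_ylabel("Revenue (USD)")
    plt.show()
```

Keeping both totals in one frame makes it easy to spot staff who sell many units but bring in little revenue, or the reverse.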

Question 6.

Develop one question yourself that can be answered with the information included in this dataset. Write the code to answer the question, and include a visualization.

Question 7.

Develop one question yourself that can be answered with the information included in this dataset. Write the code to answer the question, and include a visualization.

Part 2

You are hired to develop an online management system for a café. This program will be used by the café admins and will help them manage online orders. Use a function to develop a program with the following features:

1. Allow the café admin to enter the menu items until the admin enters quit to stop. The list should include a minimum of 10 items. For example: main_categories = ["Americano", "Espresso", "Cheese sandwich"]

2. Use the main menu list you created in step 1 to create a dictionary that contains the price of each menu item. For example: items_price = {"Americano": 13, "Espresso": 9, "Cheese sandwich": 15}

3. Use the main menu list you created in step 1 to create another dictionary that contains the quantity of each menu item. For example: items_quantity = {"Americano": 50, "Espresso": 30, "Cheese sandwich": 10}

4. Use the main menu list you created in step 1 to create another dictionary that allows the café admin to record the rating received from customers on menu items. The ratings are scored on a scale from 1 to 5, with 5 indicating the maximum customer satisfaction. For example: items_rating = {"Americano": 4, "Espresso": 1, "Cheese sandwich": 5}

Your function should return the following data structures separately:

The dictionary that includes all entries.

A list named satisfied_item, which includes the items with satisfaction of 3 or higher.

A list named highprice_item, which includes the items with price above 10.

A list named few_items, which includes the items with quantity less than 5.
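The sketch below shows one way the pieces could fit together; it is not a reference solution. Several choices are assumptions: the function name manage_menu, the read and min_items parameters (added so the function can be driven by scripted answers instead of the keyboard), and the reading of "the dictionary that includes all entries" as one nested dictionary per item. Validation of the 1 to 5 rating range is left out for brevity.

```python
def manage_menu(read=input, min_items=10):
    """Collect menu data; return (all_entries, satisfied_item, highprice_item, few_items)."""
    # Step 1: read item names until the admin types "quit"
    main_categories = []
    while True:
        name = read("Menu item (or quit): ").strip()
        if name.lower() == "quit":
            if len(main_categories) >= min_items:
                break
            print(f"Please enter at least {min_items} items.")
            continue
        if name and name not in main_categories:  # skip blanks and duplicates
            main_categories.append(name)

    # Steps 2 to 4: one dictionary each for price, quantity and rating
    items_price, items_quantity, items_rating = {}, {}, {}
    for item in main_categories:
        items_price[item] = float(read(f"Price of {item}: "))
        items_quantity[item] = int(read(f"Quantity of {item}: "))
        items_rating[item] = int(read(f"Rating of {item} (1-5): "))

    # Combined dictionary (assumed meaning of "the dictionary that includes all entries")
    all_entries = {
        item: {
            "price": items_price[item],
            "quantity": items_quantity[item],
            "rating": items_rating[item],
        }
        for item in main_categories
    }

    # The three filtered lists, using the thresholds stated in the task
    satisfied_item = [i for i in main_categories if items_rating[i] >= 3]
    highprice_item = [i for i in main_categories if items_price[i] > 10]
    few_items = [i for i in main_categories if items_quantity[i] < 5]
    return all_entries, satisfied_item, highprice_item, few_items


# Scripted demo with invented answers: three item names, "quit",
# then price, quantity and rating for each item in turn.
answers = iter(["Americano", "Espresso", "Cheese sandwich", "quit",
                "13", "50", "4",
                "9", "30", "1",
                "15", "3", "5"])
all_entries, satisfied_item, highprice_item, few_items = manage_menu(
    read=lambda prompt: next(answers), min_items=3
)
print(satisfied_item)
print(highprice_item)
print(few_items)
```

Calling manage_menu() with no arguments falls back to keyboard input and the 10-item minimum. A scripted run like the one above can double as a test case for the test plan, since the expected lists are easy to work out by hand.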

For part 2 only: First, create a step-by-step algorithm and a flowchart and then translate it into a fully functional and documented Python code. Follow the flowchart shape conventions from the session 3 reading, available here.

Your assignment submission needs to include the following resources:

A .pdf file must be the first resource and it will include all the answers to the questions above, including all the Python code you produce. Make sure that you submit a neat, clearly presented, and easy-to-read .pdf. The .pdf should be submitted under the file name “student_name.pdf”.

Your second resource must be a single zip file which should include the Jupyter Notebook with extension .ipynb and named “student_name.ipynb”, along with any additional files (e.g. pictures of your flowchart). In total, you need to submit two files: (1) the .pdf file from the first step (primary resource) and (2) a zip file including the .ipynb file from the second step (secondary resource).
