Regression problem for predicting the demand of bike-sharing services

 

We consider a regression problem for predicting the demand of bike-sharing services in Washington D.C. [1] The task is to predict the demand for bikes (column cnt) from the other features, ignoring the columns instant and dteday. Use the day.csv file from the data folder.

(a) Write a Python file to load day.csv. [2] Compute the correlation coefficient of each feature with the response (i.e., cnt) and include a table of these coefficients. Which features are positively correlated with the response (i.e., have a positive correlation coefficient)? Which feature has the highest positive correlation with the response?

(b) Were you able to find any features with a negative correlation coefficient with the response? If not, can you think of a feature that is not provided in the dataset but may have a negative correlation coefficient with the response?

(c) Now divide the data into training and test sets, with the training set holding about 70 percent of the data. Import train_test_split from sklearn to perform this operation. Use an existing package to train a multiple linear regression model on the training set using all the features (except the ones excluded above). Report the coefficients of the linear regression model and the following metrics on the training data: (1) RMSE; (2) R². [Hint: You may find sklearn.linear_model.LinearRegression useful.]

(d) Next, evaluate the model trained in step (c) on the test set generated in that step. Report the RMSE and R² metrics on the test set.

(e) Interpret the results in your own words. Which features contribute most to the linear regression model? Is the model fitting the data well? How large is the model error?
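Steps (a), (c), and (d) can be sketched roughly as follows. This is a minimal outline under stated assumptions, not a reference solution: the helper names correlation_table and fit_and_evaluate, the 0.3 test fraction, and the fixed random_state are illustrative choices, and the data is assumed to arrive as a pandas DataFrame whose response column is cnt.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

EXCLUDED = ["instant", "dteday"]  # columns the task says to ignore
TARGET = "cnt"                    # response column named in the task


def correlation_table(df):
    """Pearson correlation of every numeric feature with the response (hypothetical helper)."""
    numeric = df.drop(columns=EXCLUDED, errors="ignore").select_dtypes("number")
    corr = numeric.corr()[TARGET].drop(TARGET)
    return corr.sort_values(ascending=False)


def fit_and_evaluate(df, test_size=0.3, random_state=0):
    """Split roughly 70/30, fit a linear model, and return coefficients plus metrics."""
    X = df.drop(columns=EXCLUDED + [TARGET], errors="ignore")
    y = df[TARGET]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=random_state
    )
    model = LinearRegression().fit(X_train, y_train)

    def metrics(X_part, y_part):
        pred = model.predict(X_part)
        rmse = float(np.sqrt(mean_squared_error(y_part, pred)))
        return {"rmse": rmse, "r2": float(r2_score(y_part, pred))}

    coefficients = pd.Series(model.coef_, index=X.columns)
    return coefficients, metrics(X_train, y_train), metrics(X_test, y_test)
```

With the real data, one might call these as correlation_table(pd.read_csv("data/day.csv")) and fit_and_evaluate(pd.read_csv("data/day.csv")); the path data/day.csv is an assumption based on the task wording. If the file also contains casual and registered columns that sum to cnt, as the dataset's Readme appears to describe, a model using all features will fit almost perfectly, which is worth keeping in mind when interpreting the metrics in part (e).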

[1] https://www.kaggle.com/datasets/marklvl/bike-sharing-dataset?search=bike+demand+Washington&select=Readme.txt. The dataset page also includes a Readme.txt file that explains all the features in the dataset.

[2] Refer to https://docs.python.org/3/library/csv.html on how to load a CSV file in Python.
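Loading the file with the standard-library csv module mentioned in the footnote might look like the sketch below. The function names load_rows and numeric_columns are hypothetical, and the code assumes every column other than dteday holds numeric values.

```python
import csv


def load_rows(path):
    """Read a CSV file with a header row into a list of dictionaries (hypothetical helper)."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


def numeric_columns(rows, skip=("dteday",)):
    """Convert row dictionaries into {column name: list of floats}.

    Assumes every column not listed in `skip` parses as a number.
    """
    if not rows:
        return {}
    return {
        name: [float(row[name]) for row in rows]
        for name in rows[0]
        if name not in skip
    }
```

The resulting dictionary of columns can be passed to pandas.DataFrame if a table is more convenient for the later steps.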
