Therefore, I will accept the values given to me by RStudio.

TotalVolumeDonated = 102.088 + 36.848*MonthsSinceFirstDonation

In this blog post I will be diving into my data in order to understand it and its implications more thoroughly.

As established in my previous post, my data has 6 variables and 576 observations. Make sure to tune in next time for my post on multicollinearity.

To recap from last week's post on linear regression, I created a model of my data and got an equation that summarized the data. It is unrealistic to expect someone to donate blood exactly when they are next able to, so a cyclical approach is logical. For example, someone might want to donate blood regularly but not want to keep track of when they last donated, so to be safe, they just donate every spring and fall.

If I were to manage the marketing campaigns for blood drives, I would probably use these figures to set up an awareness push around the winter holidays, capitalizing on the altruism of the season, as well as an awareness push around the 4th of July, capitalizing on the patriotism of Americans.
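To make the regression equation recapped above a little more concrete, here is a minimal sketch that plugs a few values into it. It is written in Python rather than R purely for illustration. The coefficients are the ones from the equation, while the function name `predict_total_volume` and the month values are my own assumptions, not part of the original analysis.

```python
# Coefficients copied from the fitted equation in the post.
INTERCEPT = 102.088  # predicted ml at 0 months since first donation
SLOPE = 36.848       # additional predicted ml per month since first donation


def predict_total_volume(months_since_first_donation: float) -> float:
    """Hypothetical helper: evaluate the fitted line at a given month count."""
    return INTERCEPT + SLOPE * months_since_first_donation


if __name__ == "__main__":
    # Illustrative month values only; they are not taken from the data set.
    for months in (0, 12, 24):
        print(months, round(predict_total_volume(months), 3))
```

At 0 months the prediction is just the intercept, and each extra month adds the slope to it.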
This is important for the field of blood donation because it means that they have to work on recruiting more people to give blood, because the return rate for donors is much higher.

For NumberOfDonations, we can say that there is an increased chance that someone will donate blood for each donation they have given.

For MonthsSinceLastDonation, we can say that as the months increase, there is less and less of a chance that they will donate again.

It is important to note that Donated's units are years and MonthsSinceLastDonation is in months, so the impact on Donated is 1/12 of what is shown.

For the first, I will note whether it goes from left to right, and for the second, I will note whether the distribution is random or not.

None of these plots proves a lack of heteroskedasticity, which is a bad sign.
This is backed up by the dim function.

Although there are only 576 observations, that is still sufficient for my purposes of data analysis.

One variable that would be important to dig into is TotalVolumeDonated, as it shows who has given the most or least blood and could potentially help draw correlations among those who are truly dedicated to donating blood. I will be adding the Donated variable, which states whether or not the person has donated blood before. Below, I investigate how many donors have given more than 2000 cm^3 (or 8 donations) of blood and how many have given less than 500 cm^3 (or 2 donations) of blood. The standard amount of blood donated is 250 ml, so if someone had donated once, they would have donated 250 ml, and if they had never donated, it would be 0 ml.

As seen in this histogram of months since first donation, there are a significant number of people who have been donating for a very long time, so the slope of the line has to be much less steep to accommodate the left skew, thus driving the intercept up from 0, where it should be.

To provide even more proof, the histogram of total volume donated is left-skewed as well, confirming our theory that those who have been donating a long time have also been donating consistently. This will return a new model of:

TotalVolumeDonated = 265.13 + 39.71*MonthsSinceFirstDonation - 42.18*MonthsSinceLastDonation + 574.81*Donated

In order to determine correlation between my predictors, I will use the cor() function, which looks at correlations in pairs.

When interpreting the matrix that the code spit out, it is important to note that it is symmetrical along the diagonal line.
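The symmetry of a pairwise correlation matrix can be seen in a small sketch. This is a Python stand-in for what R's cor() computes by default (Pearson correlation), not the code used in the post. The toy values below are made up for illustration and are not from the blood donation data, and the helper names `pearson` and `correlation_matrix` are my own.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Correlate every column with every other column, pair by pair."""
    names = list(columns)
    return {r: {c: pearson(columns[r], columns[c]) for c in names} for r in names}


# Made-up toy values, only to show the shape of the result.
toy = {
    "MonthsSinceFirstDonation": [2, 10, 24, 36, 60],
    "MonthsSinceLastDonation": [2, 4, 1, 12, 3],
    "Donated": [1, 0, 1, 0, 1],
}

matrix = correlation_matrix(toy)

if __name__ == "__main__":
    for row_name, row in matrix.items():
        print(row_name, [round(value, 3) for value in row.values()])
```

Because the correlation of A with B is the same as the correlation of B with A, the entries above the diagonal mirror those below it, and every diagonal entry is 1 (a variable correlated with itself). Only one triangle of the matrix needs to be read.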
Now, I am going to explain the coefficients in a more real-world context.

Because my slope is significant, it can be extrapolated to create the formula:

TotalVolumeDonated = 102.088 + 36.848*MonthsSinceFirstDonation

Starting with the intercept, it is immediately clear that it is impossible to have already donated 102 ml of blood without ever donating blood, so there is no way for the intercept to have any real-world meaning.

The means are off by a full donation (250 ml), and I don't think it would be fair to assume that a model that may be off by a full donation is accurate.

I am going to have to conclude that this model is not acceptable to use, because it has to be stable and accurate to qualify.

If you remember, I created a logistic regression model in Monday's blog, and today I will be testing the assumptions to make sure that it is a good model.

log(p/(1-p)) = -0.74578 + 0.0718*NumberOfDonations - 0.10738*MonthsSinceLastDonation

This is the model equation that I created in my last blog, and now I am going to run the six checks required of logistic regression on it.

All the variables that I have included are relevant and I have excluded all those that aren't, so I am just going to go ahead and say that my model passes the first assumption.

For this, I am going to test multicollinearity between my two variables using the cor() function.

Because the correlation is not near 1 or -1, my model passes the second test.

There is no code that I can write to test for independence, so I will rely on my intuition for this one. In the test, there are far more outliers around the 8000 mark.
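The logistic model equation above gives log-odds, which are hard to read directly. As a minimal sketch (in Python rather than R, and not the code behind the original model), the log-odds can be turned into a probability with the inverse logit. The coefficients come from the equation, while the function names and the example donor are hypothetical.

```python
import math

# Coefficients copied from the logistic model equation in the post.
INTERCEPT = -0.74578
B_NUMBER_OF_DONATIONS = 0.0718
B_MONTHS_SINCE_LAST_DONATION = -0.10738


def donation_probability(number_of_donations, months_since_last_donation):
    """Hypothetical helper: convert the model's log-odds into a probability."""
    log_odds = (
        INTERCEPT
        + B_NUMBER_OF_DONATIONS * number_of_donations
        + B_MONTHS_SINCE_LAST_DONATION * months_since_last_donation
    )
    # Inverse logit: p = 1 / (1 + e^(-log_odds))
    return 1.0 / (1.0 + math.exp(-log_odds))


def odds_ratio(coefficient):
    """Multiplicative change in the odds for a one-unit increase in a predictor."""
    return math.exp(coefficient)


if __name__ == "__main__":
    # Made-up donor: 4 past donations, 2 months since the last one.
    print(round(donation_probability(4, 2), 3))
    print(round(odds_ratio(B_NUMBER_OF_DONATIONS), 3))
    print(round(odds_ratio(B_MONTHS_SINCE_LAST_DONATION), 3))
```

Read this way, each additional past donation multiplies the odds of donating by about 1.07, and each additional month since the last donation multiplies them by about 0.90, which matches the direction of the coefficients.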