Okay, I take it back: the equation on p.108 (printed) and p.120 (online) is not wrong after all. I was just using a different notation.

Immediately before the equation, I said:

Remember that the distance between two n-dimensional points x and y is…

So in this convention, x represents one person, say Joe: x1 is Joe’s salary, x2 is Joe’s average golf score, x3 is Joe’s hair length, and so on for each remaining feature. y is a different person (Tiffany), with y1 being Tiffany’s salary, y2 being Tiffany’s average golf score, y3 being Tiffany’s hair length, and so on. In this case, then, the formula is exactly correct.

This is the opposite of what I was doing in class, where I had the two people being named 1 and 2, with x1 and y1 being person #1’s features/coordinates, and x2 and y2 being person #2’s.

It’s all just a matter of definitions.
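
If you want to convince yourself, here is a minimal sketch comparing the two conventions. It assumes the distance in question is the ordinary Euclidean one, and the function name, variable names, and numbers are all made up for illustration (two features only, to match the x/y labeling from class).

```python
import math

def euclidean_distance(x, y):
    """Euclidean distance between two equal-length feature vectors."""
    if len(x) != len(y):
        raise ValueError("points must have the same number of dimensions")
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

# Book convention: one vector per person, subscript = feature.
# (Hypothetical values for two features.)
joe = [3.0, 1.0]       # x1, x2
tiffany = [0.0, 5.0]   # y1, y2
book_distance = euclidean_distance(joe, tiffany)

# Class convention: one letter per feature, subscript = person.
x = [3.0, 0.0]   # first feature for person #1, person #2
y = [1.0, 5.0]   # second feature for person #1, person #2
class_distance = math.sqrt((x[0] - x[1]) ** 2 + (y[0] - y[1]) ** 2)

print(book_distance, class_distance)
```

Both lines compute the same sum of squared differences, just with the letters and subscripts swapping roles, so the two printed numbers agree.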

Sorry for the confusion!

From class today, the updated Jedi/Sith classifier with the performance measurement.

And right on the heels of quiz #5 comes quiz #6, also on the same material! You know everything you need to know to take both quizzes as of April 1st. Have fun with both!

And by the way, Quiz #6 is open-calculator so be sure to have one handy before you begin.

From today’s class, the galactic cruiser security camera data set and our Naive Bayes classifier code that did predictions on it.

Quiz #5 is pretty short (only 19 minutes max) and is due this Thursday at midnight. It’s in Canvas like all the others are. Good luck!

I got so confused with the contradictory answers various students were reporting to various homework #5 questions that I decided to just do the assignment myself from scratch and use that as a key. You can find it in the “Files” tab in Canvas, if you’re interested.

Buses will leave promptly from the Bell Tower at 2pm on Friday, April 5th, DataFesters! Please remember to pack your weekend bag, and bring your $20 (cash only) to the bus!

And for the losers not going to DataFest, there will be no class on Friday. ☺

As announced in the amphitheatre today, homework #6 is now due midnight, April 3rd.

Hey DATA 219’ers, you are cordially invited to the 7th Annual UMW Computer Science Department Programming Contest™ on Friday, April 12th! Free pizza, snacks, and drinks will be provided, and teams will begin the contest at precisely 7pm. Normally the fun lasts until about midnight, when the leading teams bear down for a hair-raising neck-and-neck finish. Come have fun and strut your stuff!

Programming teams can consist of two or three students (who may be, but don’t have to be, fellow 219’ers).

For those who compete, I’ll award bonus points which you can apply to any one quiz of your choice:

  1. Two points merely for showing up, eating pizza, and doing your best!
  2. One additional point for each of the eight programming challenge problems your team gets correct!

Form your team and get excited! (If you would like to be put on a team, email me well in advance and let me know.)

Homework #6, a partner-based assignment, has been posted, and is due the last day of March.

If you’d like me to assign you a partner, please email me right away with the subject line “DATA 219 Homework #6 partner request!”