Lab I : Jump In!
Behind the Scenes
If you feel unfamiliar with Excel, you should run through the Element K on Excel.
Start Your Report Document
Start Microsoft Word and put a title for the lab report and your name centered on the top of the first page.
Watch a Demo
Downloading Sample Data
Open a browser (probably Internet Explorer). I've put the data in an Excel worksheet on the web. You can download it by RIGHT clicking here. SAVE TARGET AS onto the desktop, your file server, or your portable storage device.
The reason we right click is because otherwise Excel starts up INSIDE Internet Explorer and generally makes like difficult for us.
After you've saved the file, go to the Start Menu and fire up Microsoft Excel.
Under Excel's File menu, select open and locate the file you just saved.
Make sure you know how to adjust column width and row size, change the font of a cell's contents.
Adjust so you can see our data conveniently.
Notice that the "spreadsheet" contains several worksheets labeled "Sheet1," "Sheet2," etc. on tabs at the bottom. Change the name of the sheet we are working on by double clicking the tab and typing a new name such as "raw data." Change the name of sheet2 to "Name Freq".
Clean the Data
Notice that some of the names have "leading blanks." This can confuse a computer so we start by "cleaning" our data.
Scan down the list and remove any leading blanks you find.
Do a Quick Bit of Reading about Pivot Tables
Look at this Microsoft site: http://office.microsoft.com/en-us/assistance/HA010346321033.aspx
Watch the instructor demonstrate what a pivot table is and then download the same file and try it for yourself.
Did you remember to right click and open the file from within Excel?
Name Frequency, First Cut
Skills: creating pivot table
We are going to start with an even simpler pivot table : just names and frequencies.
Select the data by clicking above the column containing the names in our first worksheet.
Select pivot table report under "Tools."
When you get to the dialog box that asks where to put the pivot table:
You now see the pivot table "Drop stuff here" screen. We have only one "field" — our names — to work with. Drag it to where it says "Drop row fields here." Suddenly there will appear one row for each unique name in our data.
Next, drag another copy of the field to the central "Drop data items here" area. Most of the time it doesn't matter what you drop here — it's just giving something for Excel to count. Excel will put a number in each row corresponding to how many times the name at the start of that row shows up in our data.
Sorting the Results
Skills: copy/paste special value, deleting cells, sorting
- The Pivot Table comes up in alphabetical order by name but we'd like it by frequency (with the most common name at top).
- To do this we need to make it so the table is no longer "live" (that is, hooked to the original data — to see what this means, go back to raw data and change the spelling of an Alexandra and then return to the pivot table and right click anywhere within it and select "Refresh data." Notice how the numbers change).
- Use the mouse to select the entire pivot table.
- Click copy.
- Under "Edit" click on "Paste Special" and then click "Values" in the next dialog box.
- We'd like to have column titles in the first row so let's delete the current first row. Highlight cells A1 and B1 where it says "Count of FIRST NAME1" and select "Delete" under Edit. When it asks how to move the remaining cells, say "move cells up."
- Now select the two columns and under "Data" select "Sort." Tell Excel to sort by the column "Total" and to do it in descending order.
- Erase the cells in row 2 since they are an artifact of the pivot table and there really were not 357 students named Grand Total.
- Select the first ten rows of the table, copy, and past into your word document to show you've gotten to this point. Write a few sentences explaining what the numbers are.
Collapsing Data Categories, Recoding
Skills: copying cell ranges, editing cell contents.
- Return to the Raw Data worksheet.
- Select our data and click on copy (in edit menu or icon on tool bar or control-c).
- Click on cell D1 and paste, creating a copy of our data.
- Scan down the list and notice that we have numerous cases of variant spellings of the same name (e.g., Sara and Sarah).
- Recode by retyping all instances of variants to one thing. For example, we might code Sara and Sarah as "Sara/h." Do this for the entire data set and then run another pivot table.
- Copy to your report again.
- Repeat a third time, doing a more radical recoding (e.g., all the variants of Catherine and Katrine and so on, maybe all the "Mic" and "Mik" names.
Looking Up the National Norm Data
- Google the social security administration's popularity of first names web site. When you have found it, look around and then get the top names for the birth year that is closest to typical first year college students.
- Copy these to your report and comment.
- Can you think of explanations for the differences between the Mills data and the national data? Can you think of any ways you could test your ideas?
- Pick out one of our most popular names and use the website to get data on how its popularity has risen and fallen over the last twenty years. Copy/comment for your report.
- Repeat for your own name.
Add a bibliography section to your report and cite the web site from which you gathered data.