Data Security
Until the early 2000s, many schools and colleges used the Social Security number (SSN) as a unique identifier for students. The number was posted in many places and made available to teachers on class rosters. My high school ID card, which students were required to wear, had my Social Security number in large print next to my name (along with a barcode). When I came to college, professors would often post your SSN on the wall with your grades. Sometimes they’d even include your name.
As identity theft became increasingly common, universities and schools moved away from using the SSN as a unique identifier; NC State made this switch in 2005. Even after that, you could Google your SSN and find rosters with SSNs that professors had posted on their web space so they could download them at home.
Nowadays, we’re much more security-conscious: breaches like this would be major news stories and catastrophic events for which people would almost certainly lose their jobs. However, “not collecting and distributing SSNs” is only part of a data security policy. Schools should identify PII (personally identifiable information) and ensure that teachers only have access to the data that they need to do their jobs (the principle of least privilege). For example, a birthday (month and day) is probably helpful for a teacher to know. However, a full birthdate (month, day, and year), combined with a name, is PII that can be used to steal an identity if disclosed. Schools have policies identifying public directory information (usually SIS/PowerSchool ID, name, address, grade level) as permitted under FERPA. Beyond that, schools typically have (or are developing) policies about what information staff in different roles are allowed to collect or access. North Carolina has rules that go beyond FERPA with the NC Student Privacy Act. The State Security Standards are also a good place to start.
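As a rough illustration of least privilege, the sketch below filters a student record down to the fields a given role is allowed to see, so that a teacher gets the birthday (month and day) but not the full birthdate. The role names, field names, and the `view_for_role` helper are all hypothetical and are not taken from any district's actual policy or from any SIS product.

```python
from datetime import date

# Hypothetical mapping of staff roles to the fields each role may see.
# A real district would derive this from its own written policy.
ROLE_FIELDS = {
    "teacher": {"name", "grade_level", "birthday"},
    "registrar": {"name", "grade_level", "birthday", "birthdate", "address"},
}


def view_for_role(student, role):
    """Return only the fields of `student` that `role` is allowed to see.

    Unknown roles get an empty view (deny by default).
    """
    allowed = ROLE_FIELDS.get(role, set())
    record = dict(student)
    # Derive the less sensitive month-and-day birthday from the full birthdate.
    record["birthday"] = student["birthdate"].strftime("%m-%d")
    return {field: value for field, value in record.items() if field in allowed}


# Made-up example record.
student = {
    "name": "Sample Student",
    "grade_level": 3,
    "birthdate": date(2015, 4, 9),
    "address": "123 Example St",
}
teacher_view = view_for_role(student, "teacher")
```

Here `teacher_view` contains the name, grade level, and the month-and-day birthday, while the full birthdate and address are left out. Denying unknown roles by default is a design choice in this sketch, in keeping with the least-privilege idea.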
Beyond identifying what data staff are allowed to collect, a CTO is often required to identify storage locations for the PII that teachers are allowed to have. In general, a laptop is a bad place to store data due to the risk of theft, and a flash drive is typically forbidden for the same reason. Conventional wisdom used to be that PII should only be stored on local servers. However, that is evolving as cloud infrastructure becomes more secure and organizations move away from traditional servers. NC State, for example, allows the NC State-owned Google Drive account to be used to store PII. The rationale is that because NC State requires two-factor authentication, the account is more secure than on-campus storage. The caveat here is that Google Drive makes it very easy to make files public, which is a risk that must be managed. Wake County Public Schools addresses this issue by providing both Google Drive and Microsoft OneDrive. They only allow PII to be stored in OneDrive, because within the OneDrive ecosystem they have disabled the ability to share files outside of the Wake County staff network.
It’s important to make clear to staff the difference between, for example, G Suite accounts and personally owned Google Drive accounts. As discussed earlier in the course, Google has a separate data processing agreement and a FERPA compliance statement for G Suite. This allows data to be stored in a G Suite account that couldn’t be stored in a personal account (the same is true for any service like Office 365, Dropbox, etc.).
NC State has developed a data elements matrix that categorizes different elements based on the various laws. Green data is public, yellow data is quasi-public, red data is restricted, and purple data requires special handling to store. Along with the table of data elements, there is a storage locations table that defines where data may be stored. Many school districts are adopting similar tables to help staff understand who may access data. Some districts are also adding a third dimension, which defines with whom data may be shared. Any parent has legal rights to any record about their child under FERPA (however, counseling records are confidential, and many districts also keep library checkouts and Internet history confidential unless there’s an issue). For other staff, the principle of least privilege suggests that they should only have access to data if their job requires it.
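A data elements table and a storage locations table can be combined into a simple lookup, sketched below. The four color levels come from the description above; everything else is an assumption made for illustration: the ordering of the levels from least to most sensitive, the classification given to each element, the storage locations, and the `may_store` helper. None of it reproduces NC State's actual matrix.

```python
from enum import IntEnum


class Sensitivity(IntEnum):
    # Assumed ordering from least to most sensitive.
    GREEN = 1   # public
    YELLOW = 2  # quasi-public
    RED = 3     # restricted
    PURPLE = 4  # requires special handling to store


# Hypothetical data elements table (illustrative classifications only).
DATA_ELEMENTS = {
    "name": Sensitivity.GREEN,
    "grade_level": Sensitivity.GREEN,
    "birthdate": Sensitivity.RED,
    "ssn": Sensitivity.PURPLE,
}

# Hypothetical storage locations table: the most sensitive level
# each location is approved to hold.
STORAGE_LOCATIONS = {
    "flash_drive": Sensitivity.GREEN,
    "personal_cloud_account": Sensitivity.GREEN,
    "district_cloud_drive": Sensitivity.RED,
}


def may_store(element, location):
    """Return True if `element` may be stored at `location`."""
    # Unclassified elements are treated as the most sensitive level.
    level = DATA_ELEMENTS.get(element, Sensitivity.PURPLE)
    ceiling = STORAGE_LOCATIONS.get(location)
    if ceiling is None:
        # Locations missing from the table are not approved for anything.
        return False
    return level <= ceiling
```

With these made-up tables, `may_store("name", "flash_drive")` is allowed, while `may_store("birthdate", "flash_drive")` and `may_store("ssn", "district_cloud_drive")` are not. Treating unknown elements and unknown locations as disallowed keeps the sketch on the cautious side.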
All of this falls under the umbrella of data governance: the systems and structures that ensure the data collected are known, collection procedures are followed, and data are properly handled. Other principles of data governance include:
- Data Inventories: Knowing what data are collected, where they live, and how they’re used.
- Data Architecture: Structuring your data for the most efficient collection and use.
- Data Minimization: Collecting only the data that are needed, and no more.
- Data Lifecycle: Destroying data when they’re no longer needed.
- Small Cell Suppression: In analysis of aggregate data, ensuring that individuals don’t become identifiable through disaggregation (e.g., running an analysis on a kindergarten classroom and breaking it down by race, sex, and economic status, so that only one or two students are in an analysis group).
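Small cell suppression in particular lends itself to a short example. The sketch below counts students in each disaggregated group and withholds any count below a minimum cell size. The threshold of 10, the field names, and the `suppressed_counts` helper are assumptions for illustration; the actual minimum is set by state or district policy. It also covers only the basic case and does not try to stop a suppressed count from being recovered by subtracting from a published total.

```python
from collections import Counter

# Assumed threshold for illustration; the real value is a policy decision.
MIN_CELL_SIZE = 10


def suppressed_counts(records, group_keys, min_cell_size=MIN_CELL_SIZE):
    """Count records per group, replacing small counts with None.

    `records` is a list of dicts and `group_keys` names the fields
    to disaggregate by. Groups smaller than `min_cell_size` are
    reported as None instead of their true count.
    """
    counts = Counter(tuple(record[key] for key in group_keys) for record in records)
    return {
        group: (count if count >= min_cell_size else None)
        for group, count in counts.items()
    }


# Made-up records: 12 students in one group, 2 in another.
records = (
    [{"grade": "K", "lunch_status": "free"}] * 12
    + [{"grade": "K", "lunch_status": "paid"}] * 2
)
report = suppressed_counts(records, ["grade", "lunch_status"])
```

In this made-up data, the group of 12 is reported with its count, while the group of 2 is reported as `None`, since publishing it could point to individual students.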