Deciphering Big Data Summary

Collaborative Discussion 1 - The Data Collection Process

We discussed the opportunities, limitations, risks and challenges of large-scale data collection.

In my initial post I wrote about the reasons why "big data" can be unreliable, and how it can be "cleaned" and then analysed.

Other students commented that I could also discuss the non-uniform structure of much "big data", and the possibility that data "outliers" could actually be meaningful - a sensor detecting a real anomaly, for example. In my summary post I added these ideas.
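The point about outliers can be illustrated with a minimal sketch. It assumes a simple list of numeric sensor readings; the function name, the threshold and the sample values are hypothetical and chosen only for illustration. Instead of deleting unusual values during cleaning, the sketch flags them with a modified z-score (based on the median and the median absolute deviation) so that a person can decide whether each one is an error or a real anomaly.

```python
from statistics import median


def flag_outliers(readings, threshold=3.5):
    """Pair each reading with True/False: is it a suspected outlier?

    Uses the modified z-score, 0.6745 * (x - median) / MAD.
    The threshold of 3.5 is a common rule of thumb, assumed here.
    """
    med = median(readings)
    # Median absolute deviation: a spread measure that is not
    # distorted by the extreme values we are trying to detect.
    mad = median(abs(x - med) for x in readings)
    if mad == 0:
        # Simplification: with no measurable spread, flag nothing.
        return [(x, False) for x in readings]
    return [(x, abs(0.6745 * (x - med) / mad) > threshold) for x in readings]


# Hypothetical temperature readings with one unusual value.
readings = [20.1, 19.8, 20.3, 20.0, 35.6, 19.9]
flagged = flag_outliers(readings)

# Keep every reading; list the suspects separately for manual review.
suspect = [value for value, is_outlier in flagged if is_outlier]
print(suspect)
```

Nothing is removed from the data here. The flagged values are only set aside for review, which leaves room for the case where the sensor was reporting something real.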

Collaborative Discussion 2 - Comparing Compliance Laws

We compared the rules of the GDPR with similar compliance laws in another region.

I lived in the Netherlands at the time, so I compared the UK GDPR with the EU GDPR. I found that after Brexit the UK regulations were simply copied from the EU regulations, but over time differences have developed. The EU has a list of countries to which data transfers are allowed, but the UK has a different list. The EU GDPR is overseen by the European Data Protection Board, while the UK is no longer part of that board, although it maintains cooperation via separate agreements.

Team Exercise - Development Project

Our team exercise was to imagine a client organisation which needs to store data in a database, then propose a database design for their data.

Our team worked very well together, and we held several meetings through group discussions and video calls. We quickly agreed on how to divide up the work and bring it together in our final report.

We were very pleased with our final product, but in his feedback our tutor said we didn't fully meet the requirements. Our report was too technical (it included Python and SQL code) and we should have included a critical discussion of the available options, rather than just describing in technical detail the option we chose.

On reflection I think we misunderstood the assignment - we were explaining our solution to the tutor, rather than recommending our solution to the imaginary client.

I learned that it is vitally important to carefully read the assignment instructions and not just assume I understand what is required. In future I plan to spend more time on this step, rather than rush to get started.