Tag Archives: data

The best data visualisation work of 2018

Another end of year roundup, this time looking at data visualisation design.

Information is Beautiful Awards 2018: The Winners
Let’s raise a glass to dataviz that pushes boundaries, illuminates truth, and celebrates beauty. Thank you to everyone who joined us on the Information is Beautiful Awards journey this year – now see which entries took home trophies at tonight’s spectacular ceremony.

There is so much to pore over here. Two that stood out particularly for me were this visual representation of a Beethoven string quartet and this unusual view of our lively planet.

Dynamic Planet Interactive Scientific Poster
The Interactive Scientific Poster „Dynamic Planet” was designed and developed for the exhibition „Focus Earth” of the GFZ German Research Centre for Geosciences in Potsdam. One of the main advantages of a digital poster is that it can display dynamic content. This is at the same time the essential statement of the scientific poster “Dynamic Planet”: our earth never stands still, is permanently shaken by earthquakes. These tensions are measured by three measuring points of the GFZ and their data is visualized in real-time in an interactive poster in the exhibition context. The viewer is given a direct impression, he can be a “witness” to current measurement and research.

This video demonstrates how people can interact with the poster, to navigate the large amount of data presented in an intuitive, visual manner.

Dynamic Planet – Scientific Poster
The challenge for the interface design was to ensure a clear overview, despite of the massively many events. The solution consists in an interactive graphical representation of the events by filtering the earthquakes by eg. magnitude and depth. A special visual feature of the scientific poster “Dynamic Planet” is the representation of the earthquakes depth in a transparent, rotating globe.

Introducing children to data visualisation

The economist and dataviz blogger Jonathan Schwabish took on an unusual challenge: introducing his son’s primary school classmates to data visualisation.

I wouldn’t know where to start — I’m still not sure of the difference between a histogram and a bar chart — but cleverly, Jonathan begins with examples of diagrams everyone is familiar with. Maps.

Teaching data visualization to kids
I then introduced the term “choropleth” and showed them this map of graveyards in the US and this map of McDonald’s (a couple of kids actually tied the two together!). I also showed them a clip of Aron Koblins’ Flight Patterns project (my son loves this one)—the simple and intuitive animation, and black and white color scheme make it easy to follow. I also showed them a video of Martin Wattenberg and Fernanda Viegas’ Wind Map, again, something I think they could all relate to.

He then asks the children to draw their own maps, of their homes rather than the whole world, and to add in any data they like.

I then passed out tracing paper and, bringing up the graphs I showed them earlier in which color, dots, lines, and bubbles were placed on top of the map, I asked them to plot any data they liked. … Could they add differently-sized bubbles to their favorite rooms? Could they draw lines showing their paths through the house? What about smiley faces for the most fun room?

What a fantastic idea. I hope others are similarly encouraged to spread the word in this way. As he says in his conclusion, helping children to understand graphs is a good thing for many reasons.

I’d love to see a way to make data visualization education a broader part of the curriculum, both on its own and linked with their math and other classes. Imagine adding different shapes to maps in their Social Studies classes to encode data or using waterfall charts in their math classes to visually demonstrate a simple mathematical equation or developing simple network diagrams in science class. The combination of the scientific approach to data visualization and the creativity it sparks could serve as a great way to help students learn.

(Via FlowingData.)

Stolen millions

More announcements of company data (our data) being stolen. The numbers involved each time are just incredible.

Hackers breach Quora.com and steal password data for 100 million users
Compromised information includes cryptographically protected passwords, full names, email addresses, data imported from linked networks, and a variety of non-public content and actions, including direct messages, answer requests and downvotes. […] In a post published late Monday afternoon, Quora officials said they discovered the unauthorized access on Friday. They have since hired a digital forensics and security firm to investigate and have also reported the breach to law enforcement officials.

Whenever these stories are reported, the articles often end with a little summary of other recent snafus. The one above ended with:

Quora’s post is only the latest disclosure of a major breach. On Friday, hotel chain Marriott International said a system breach allowed hackers to steal passport numbers, credit card data, and other details for 500 million customers. In September, Facebook reported an attack on its network allowed hackers to steal personal details for as many as 50 million users. The social network later lowered the number of accounts affected to about 30 million.

A post from The Register, about that massive Marriott breach, concluded with this reminder of previous losses.

Marriott’s Starwood hotels mega-hack: Half a BILLION guests’ deets exposed over 4 years
Few hacks of individual firm’s customer data have come close to the scale of this one. The Yahoo! breach in 2013 saw three billion email accounts breached, while Carphone Dixons, the UK electronics retail chain, managed to lose control of 5.9 million sets of payment card data. In the US, the US Government Office for Personnel Management (which handles sensitive files on millions of government workers) had the personal data of 21 million employees’ breached by hackers.

A plan to reduce (some) teacher workload

Following on from the Ofsted Chief Inspector’s comments about teachers being ‘reduced to data managers’, the Education Secretary has written to all schools this week, reiterating his commitment to “clamp down on teachers’ workload.” Here’s the DfE’s press release.

DfE: More support for school leaders to tackle workload
Secretary of State for Education Damian Hinds said: “Many teachers are having to work way too many hours each week on unnecessary tasks, including excessive time spent on marking and data analysis. I want to make sure teachers are teaching, not putting data into spreadsheets. That’s why I am stopping my department asking for data other than in the school’s existing format.

“I am united with the unions and Ofsted in wanting teachers to do less admin. I have a straightforward message to head teachers who want their staff to cut right down on collecting data to be able to devote energies to teaching: I will support you. Frequent data drops and excessive monitoring of a child’s progress are not required either by Ofsted or by the DfE.”

Here’s a link to the Teacher Workload Advisory Group report and government response.

It will be interesting to see how this plays out and what, if anything, changes. Perhaps less data collection, rather than less data analysis: I can see schools making judgements about students’ progress two or three times a year, instead of three or four, but will they really stop analysing progress by pupil premium, SEN, ethnicity, gender and so on and so on? Perhaps this is aimed at classroom teachers, rather than subject leaders, data managers and SLT data leads.

Straightforward data science intro

This looks like an interesting response to the call to be more data literate. Via FlowingData, it seems to be a straightforward and potentially free way to get skilled up with R, without needing to install any software.

Chromebook Data Science – a free online data science program for anyone with a web browser
The reason they are called Chromebook Data Science is because philosophically our goal was that anyone with a Chromebook could do the courses. All you need is a web browser and an internet connection. The courses all take advantage of RStudio Cloud so that all course work can be completed entirely in a web browser. No need to install software or have the latest MacBook Computer.

Here’s some info on what the courses cover, including introductions to R and GitHub. Worth a look?

Excel’s getting interesting. No, really

News that Excel will soon be expanding its range of data types, enabling a much richer and more dynamic experience.

Excel Data Types
AI powered Excel Data Types will transform the way we work with Excel by enabling a cell to contain much more than text, numbers or formulas.

There are currently two Excel data types available to Office 365 users; Stocks and Geography. Let’s start with the Geography Data Type that can take a table of countries and return rich data that can be referenced in Excel formulas and expand into further columns.

Mynda takes us through many other examples of how these new data types can be used and referenced in our spreadsheets. And it seems like this is just the beginning.

The Excel team have big plans for Data Types with more coming, including the ability to create your own data types unique to your organisation. Imagine data types for Employees, Products, Stores, Regions… the list is endless.

“Reduced”?

Have I just been insulted by the head of Ofsted?

Spielman: Teachers ‘reduced to data managers’
Teachers have been reduced to “data managers” instead of “experts in their field”, Ofsted chief inspector Amanda Spielman will argue today. “I don’t know a single teacher who went into teaching to get the perfect Progress 8 score (a measure of pupil progress),” she will tell a schools conference in Newcastle this morning.

I see her point, though that’s unfortunate paraphrasing from TES. The Guardian’s version, after the speech actually took place, also has the line ‘reduced teachers to the status of “data managers”.’

Here’s the full, less patronising, quote, with no reduction to be seen.

Amanda Spielman speech to the SCHOOLS NorthEast summit
The bottom line is that we must make sure that we, as an inspectorate, complement rather than intensify performance data, because our curriculum research and a vast amount of sector feedback have told us that a focus on performance data is coming at the expense of what is taught in schools.

A new focus on substance should change that, bringing the inspection conversation back to the substance of young people’s learning and treating teachers like the experts in their field, not just data managers. I don’t know a single teacher who went into teaching to get the perfect progress 8 score. They go into it because they love what they teach and want children to love it too. That is where the inspection conversation should start and with the new framework, we have an opportunity to do just that.

And here’s another write-up that doesn’t mention us data managers at all.

Spielman: Focus on substance over outcomes will help tackle workload
The chief inspector said she did not think there was an “appetite to revive the inspection model of 20 years ago”, but that the new framework, which will come into effect next September, will build on some of the “strengths” of the current system, “especially letting leaders tell their own story”.

“I also want to rebalance inspector time usage so that more time is spent on site, having those professional conversations with leaders and teachers, with less time away from schools and colleges in pre and post-inspection activity.”

Follow the data

I’m hearing more and more about data ethics. It wasn’t ‘a thing’ before, was it? But it certainly is now. Here’s a very interesting take on it: flow.

The ethics of data flow
In Privacy in Context, Helen Nissenbaum connects data’s mobility to privacy and ethics. For Nissenbaum, the important issue isn’t what data should be private or public, but how data and information flow: what happens to your data, and how it is used. Information flows are central to our expectations of privacy, and respecting those expectations is at the heart of data ethics.

It’s not what they’ve got, but what they do with it that matters.

The infamous Target case, in which Target outed a pregnant teenager by sending ad circulars to her home, is a great example. We all buy things, and when we buy things, we know that data is used—to send bills and to manage inventory, if nothing else. In this case, the surprise was that Target used this customer’s purchase history to identify her as pregnant, and send circulars advertising products for pregnant women and new mothers to her house. The problem isn’t the collection of data, or even its use; the problem is that the advertising comes from, and produces, a different and unexpected data flow. The data that’s flowing isn’t just the feed to the marketing contractor. That ad circular, pushed into a mailbox (and read by the girl’s father) is another data flow, and one that’s not expected.

[…]

Everyone who works with data knows that data becomes much more powerful when it is combined with data from other sources. Data that seems innocuous, like a grocery store purchase history, can be combined with geographic data, medical data, and other kinds of data to characterize users and their behavior with great precision. Knowing whether a person purchases cigarettes can be of great interest to an insurance company, as can knowing whether a cardiac patient is buying bacon.

The article is written by and for data developers, primarily, and poses more questions than it can answer, especially around the thorny concept of data deletion. It’s an interesting read, but it left me wondering if those GDPR data protection principles will ever be fully put into practice.

Happy statistics day

It’s GCSE results day and, despite the new grading system, the news people are bringing out updated versions of their usual it’s-getting-better-it’s-getting-worse stories.

GCSE results day 2018: New ‘tougher’ exams favour boys as gender gap narrowest in seven years
Girls remain in the lead, with 23.4 per cent achieving one of the highest grades, which is the same as last year, compared to 17.1 per cent of boys, up from 16.2 per cent last year. But the gap in top grades between boy and girls is now at its narrowest since 2010, with boys just 6.3 per cent behind girls, down from 7.2 per cent last year.
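
A quick check of the arithmetic behind that headline gap. This is only a sketch using the figures quoted above; the dictionary layout and the function name are my own assumptions, not anything the exam boards publish. It also shows why the gap is better described in percentage points than in per cent.

```python
# Figures as quoted in the article above: share achieving one of the
# highest grades, in per cent. The layout here is an assumption of mine.
top_grades = {
    2017: {"girls": 23.4, "boys": 16.2},
    2018: {"girls": 23.4, "boys": 17.1},
}


def gap_in_points(year_figures):
    """Girls' share minus boys' share, in percentage points (1 decimal place)."""
    return round(year_figures["girls"] - year_figures["boys"], 1)


# Gap for each year, and how much it moved between the two years.
gaps = {year: gap_in_points(figures) for year, figures in top_grades.items()}
change = round(gaps[2018] - gaps[2017], 1)

# The same 2018 gap expressed relative to the girls' figure, in per cent.
relative_gap_2018 = round(gaps[2018] / top_grades[2018]["girls"] * 100, 1)

for year, gap in sorted(gaps.items()):
    print(f"{year}: gap of {gap} percentage points")
print(f"Change in gap: {change} percentage points")
print(f"2018 gap relative to girls' figure: {relative_gap_2018} per cent")
```

The subtraction gives the 6.3 and 7.2 quoted in the article, which are differences in percentage points. Dividing the gap by the girls’ figure gives a relative difference instead, which is a much larger number and a different claim.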

GCSE results rise despite tougher exams
A total of 20 of the most popular GCSE subjects in England have been graded for the first time in the numerical format – plus English and maths, which were introduced in the new format last year. These include history, geography, sciences and modern languages, all of which have been designed to be more difficult.

Of those achieving all grade 9s – and taking at least seven of the new GCSEs – almost two-thirds were girls.

GCSEs: boys close gap on girls after exams overhaul
Boys appear to have been the major beneficiary of the overhaul of GCSE examinations taken in England for the first time this summer, as results showed across-the-board improvements in boys gaining top marks while girls saw their share of top grades dip.

Across the UK the proportion of students gaining an A or 7 and above, the new top grade used in England, rose above 20%, with boys in England closing the gap on girls with an almost one percentage point rise to 17.1% with girls unchanged at 23.4%.

In the reformed GCSEs in England, 4.3% of the results were the new highest 9 grade, set at a higher mark than the previous A* grade. The figures on Thursday showed 732 students attained seven or more grade 9s.

Despite the improvements by boys in England they were still outperformed by girls at the highest level: 5% of entries by girls received 9s, compared with just 3.6% of boys.

GCSE pass rate goes UP – but fewer students get new top ‘9’ grade compared to old A* mark
The overall pass rate – the percentage of students getting a 4 or above or a C or above – was 66.9 per cent, compared with 66.4 per cent last year.

But just 4.3 per cent of exams were given the new 9 grade, which was brought in to reward the absolute highest achievers. Just 732 students in England got a clean sweep of seven or more grade 9s.

Previously around seven per cent of exams scored the top A* grade.

The more detail-oriented education sector websites are worth a read, if you really want to dig down into all this.

GCSE results 2018: How many grade 9s were awarded in the newly reformed subjects?
There has been a curious amount of interest in how many students might achieve straight 9s in all subjects. It seems to have started with a throwaway remark on twitter by the then-chief scientific adviser at the Department for Education that only two students would do so. Tom Benton from Cambridge Assessment then produced some excellent research showing that it would, in fact, be several hundred.

Today Ofqual has answered the question once and for all. A total of 732 students who took at least seven reformed GCSEs achieved grade 9 in all of them. Given that fewer grade 9s are awarded than grade A*, it should come as no surprise that fewer students will achieve straight grade 9s compared to straight grade A*s.

But it can get a little heavy-going at times.

GCSE results day 2018: The main trends in entries and grades
Across all subjects, 21.5% of entries were awarded a grade 7/A or above, compared to 21.1% last year. At grade 4/C or above, 69.3% of entries achieved the standard this year, compared to 68.9% last year. Both figures have been on something of a downward trend since 2015, so this year’s figures arrest this decline.

GCSE and A-Level results analysis
Explore trends in national entry and attainment data between 2014 and 2018 in:
All subjects
Additional mathematics
Additional science
Art and design subjects
Biology
Business and communication systems
Business studies
Chemistry
Citizenship studies

GCSE 2018 variability charts: Are your results normal?
Each year Ofqual produces boring-sounding variability charts. It sounds dull but they show how many centres, i.e. schools or colleges, dropped or increased their results compared with the previous year. This means that if you dropped, say, 25 per cent in one subject, you can see how many other schools also saw the same dip.
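
To make the idea concrete, here is a rough sketch of the counting behind a chart like that. The centre names and results are invented for illustration, and the 10-point band width is my own assumption; Ofqual’s actual method and banding may well differ.

```python
from collections import Counter

# Hypothetical data, invented for illustration: percentage of entries at
# grade 4/C or above in one subject, as (last year, this year) per centre.
results = {
    "Centre A": (62.0, 70.0),
    "Centre B": (55.0, 30.0),
    "Centre C": (80.0, 78.0),
    "Centre D": (45.0, 47.0),
    "Centre E": (68.0, 41.0),
}

BAND_WIDTH = 10  # band width in percentage points (an assumption)


def band(change, width=BAND_WIDTH):
    """Return the lower edge of the band a year-on-year change falls into."""
    return int(change // width) * width


def variability_counts(centre_results):
    """Count how many centres fall into each band of year-on-year change."""
    return Counter(
        band(this_year - last_year)
        for last_year, this_year in centre_results.values()
    )


counts = variability_counts(results)
for lower in sorted(counts):
    upper = lower + BAND_WIDTH
    print(f"{lower:+d} to {upper:+d} points: {counts[lower]} centre(s)")
```

With these made-up numbers, a centre that dropped 25 points can see that one other centre landed in the same band, which is the comparison the charts are meant to allow.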

Let’s give the last word to the JCQ and Ofqual, and have done with it: I’m getting a headache.

JCQ Joint Council for Qualifications: Examination results
Each year, JCQCIC collates the collective results for its members from more than 26 million scripts and items of coursework. We only publish collated results from our members though and cannot supply regional, centre or candidate information.

Ofqual Analytics
Ofqual analytics presents a selection of data in an engaging and accessible way by using interactive visualisations. We hope this innovative approach to presenting data will make it easier to understand and explore the data we produce.

Map of GCSE (9 to 1) grade outcomes by county in England
The map shows reformed GCSE full course results (the percentage of students achieving specific grades) in England by subject and county for the summer 2018 examination series as well as the summer 2017 examination series. Data in the map represents the results that were issued on results day for both years (23 August 2018 and 24 August 2017) and do not reflect any changes following post-results services.

(All I know is that my boy got his GCSE results today too, and they’re a credit to the amount of time and effort he’s put in over the years.)

AI to the rescue

In 2016 the RNIB announced a project between the NHS and DeepMind, Google’s artificial intelligence company.

Artificial intelligence to look for early signs of eye conditions humans might miss
With the number of people affected by sight loss in the UK predicted to double by 2050, Moorfields Eye Hospital NHS Foundation Trust and DeepMind Health have joined forces to explore how new technologies can help medical research into eye diseases.

This wasn’t the only collaboration with the NHS that Google was involved in. There was another project, to help staff monitor patients with kidney disease, that had people concerned about the amount of medical information being handed over.

Revealed: Google AI has access to huge haul of NHS patient data
Google says that since there is no separate dataset for people with kidney conditions, it needs access to all of the data in order to run Streams effectively. In a statement, the Royal Free NHS Trust says that it “provides DeepMind with NHS patient data in accordance with strict information governance rules and for the purpose of direct clinical care only.”

Still, some are likely to be concerned by the amount of information being made available to Google. It includes logs of day-to-day hospital activity, such as records of the location and status of patients – as well as who visits them and when. The hospitals will also share the results of certain pathology and radiology tests.

The Google-owned company tried to reassure us that everything was being done appropriately, that all those medical records would be safe with them.

DeepMind hits back at criticism of its NHS data-sharing deal
DeepMind co-founder Mustafa Suleyman has said negative headlines surrounding his company’s data-sharing deal with the NHS are being “driven by a group with a particular view to peddle”. […]

All the data shared with DeepMind will be encrypted and parent company Google will not have access to it. Suleyman said the company was holding itself to “an unprecedented level of oversight”.

That didn’t seem to cut it though.

DeepMind’s data deal with the NHS broke privacy law
“The Royal Free did not have a valid basis for satisfying the common law duty of confidence and therefore the processing of that data breached that duty,” the ICO said in its letter to the Royal Free NHS Trust. “In this light, the processing was not lawful under the Act.” […]

“The Commission is not persuaded that it was necessary and proportionate to process 1.6 million partial patient records in order to test the clinical safety of the application. The processing of these records was, in the Commissioner’s view, excessive,” the ICO said.

And now here we are, some years later, and that eye project is a big hit.

Artificial intelligence equal to experts in detecting eye diseases
The breakthrough research, published online by Nature Medicine, describes how machine-learning technology has been successfully trained on thousands of historic de-personalised eye scans to identify features of eye disease and recommend how patients should be referred for care.

Researchers hope the technology could one day transform the way professionals carry out eye tests, allowing them to spot conditions earlier and prioritise patients with the most serious eye diseases before irreversible damage sets in.

That’s from UCL, one of the project’s partners. I like the use of the phrase ‘historic de-personalised eye scans’. And it doesn’t mention Google once.

Other reports also now seem to be pushing the ‘AI will rescue us’ angle, rather than the previous ‘Google will misuse our data’ line.

DeepMind AI matches health experts at spotting eye diseases
DeepMind’s ultimate aim is to develop and implement a system that can assist the UK’s National Health Service with its ever-growing workload. Accurate AI judgements would lead to faster diagnoses and, in theory, treatment that could save patients’ vision.

Artificial intelligence ‘did not miss a single urgent case’
He told the BBC: “I think this will make most eye specialists gasp because we have shown this algorithm is as good as the world’s leading experts in interpreting these scans.” […]

He said: “Every eye doctor has seen patients go blind due to delays in referral; AI should help us to flag those urgent cases and get them treated early.”

And it seems AI can help with the really tricky problems too.

This robot uses AI to find Waldo, thereby ruining Where’s Waldo
To me, this is like the equivalent of cheating on your math homework by looking for the answers at the back of your textbook. Or worse, like getting a hand-me-down copy of Where’s Waldo and when you open the book, you find that your older cousin has already circled the Waldos in red marker. It’s about the journey, not the destination — the process of methodically scanning pages with your eyes is entirely lost! But of course, no one is actually going to use this robot to take the fun out of Where’s Waldo, it’s just a demonstration of what AutoML can do.

There’s Waldo is a robot that finds Waldo

We all need to be data literate

This article from Harvard Business Review doesn’t mention schools once, but I think it fits perfectly well in that setting.

The democratization of data science
Intelligent people find new uses for data science every day. Still, despite the explosion of interest in the data collected by just about every sector of American business — from financial companies and health care firms to management consultancies and the government — many organizations continue to relegate data-science knowledge to a small number of employees.

That’s a mistake — and in the long run, it’s unsustainable.

It goes on to outline the three steps necessary to create a more data-literate organisation: share data tools, spread data skills, and spread data responsibility. Couldn’t agree more. It’s well worth a read.