Data Science Day: Q&A

This #DataScienceDay, we sat down with two of the team to ask them what it’s like to be a data scientist. They share their experiences, talk about their inspirations and achievements, and give us insight into the future of data science.

What inspired you to work in the field of data science?

Richard: “I’ve always had a passion for new technologies and studied artificial intelligence at university. I work with data daily and enjoy using new techniques to solve problems with it. I returned to university to study Data Science and haven’t looked back. I’m now in a role where I can combine my studies and skills with a genuine passion for crunching data.”

Janice: “I’d not term myself a data scientist; however, I have a huge interest in data and in how we can utilise it to unlock value and improve decision-making. Historically in oil and gas, formatting and consolidation of data was sporadic, without any standardised or organised approach. Equipment tag numbers are a painful example of that! When I first got to play with Business Objects back in the early 1990s, that was a joy, and now, thankfully, good data management is understood as the key to cost reduction and increased efficiencies.”

Tell us about your pathway into a career in data.

Richard: “I started off as a document controller and quickly became aware of the important role that metadata plays in effective information management. I had a passion for programming and problem solving and saw that I could use these skills to streamline processes and make important information readily accessible. I did a lot of learning from online resources, picking up new skills to solve particular problems. Over time I realised I’d accumulated a skillset that was quite comprehensive.”

Janice: “It came through necessity, common sense and downright outrage. In the early days of my career, I experienced operators being charged huge sums of money by contractors to consolidate data when, in my mind, it should have been part of their contractual obligations. Fast forward to the almost present day, and it was even more obvious that there were vast amounts of valuable, unused data buried in paper files, Excel spreadsheets and handwritten reports. A goldmine for anyone interested in ‘freeing’ data!”

What is the most rewarding part about being a data scientist?

Richard: “My favourite aspect is finding new value in old data and figuring out a way to make that transformation possible and, just as importantly, easy for the end user to interpret and understand. At Imrandd, our teams have a range of techniques to extract data from old reports. We then transform it into something more universal, such as an IDMS or a consolidated report, so the client can unlock the value. That’s both challenging and rewarding.”

Janice: “The most rewarding part about working with data is making something out of what may seem like nothing! Working with data can be difficult and very frustrating. You can begin with a pile of ‘stuff’ with no apparent connection or pattern, and gradually it becomes clearer. A key part is generally reducing the data, cleansing it and getting it into a format that can be useful. Preparation effort is probably about 80% of any data project.”

What is your biggest achievement to date?

Richard: “I’m still new to the field, studying data science at university on a graduate apprenticeship programme with Imrandd. It’s a great way to retain the skills I’ve been learning and cement the knowledge with real world application. With Imrandd I’ve had the opportunity to use machine learning to train a document classifier that’s used to filter reports being fed into AIDA EXTRACT, a piece of software that pulls information from inspection reports and converts it into data with actionable insights.

Most recently, I trained a convolutional neural network and a Siamese triplet network which I’m currently using to classify technical drawings.”

Janice: “That is a tough one! I’ve worked on fascinating integration projects and I’ve also loved working on engineering maintenance applications. In fact, one of my early achievements was consolidating seven different tag formats and the associated data from several engineering companies into one set of data!

A further, slightly more exotic, activity was producing light curves from space data to look for exoplanets. Within my role at Imrandd, our team has produced several great AIDA products which make integrity data more accessible and usable.”

What advice would you give to aspiring data scientists?

Richard: “Don’t be put off by the maths and statistics. Formulae and equations can seem daunting to some, but I’ve found that understanding how they are derived is less important than understanding how and when to use them correctly.”

Janice: “From a practical perspective, working with data is hard work but very rewarding. My key piece of advice is to get an understanding of any data before you do anything with it. That means looking at it, not just relying on tools. If it’s in a spreadsheet, look at it line by line. What do you see? Can you get a feel for it? Build your own relationship with the dataset. It can be a bit like looking at a 3D Magic Eye picture: nothing is clear, and then a pattern gradually emerges.”

What does the future of data look like?

Richard: “We’re living in a time where the importance of data is being recognised by all industries, but some industries are older than others and their legacy data isn’t in a state where it’s currently actionable. This means that there’s the burden of preparing this data before it can be used to generate insights and create value. Once that potential is unlocked, I believe we’ll see big changes and advances across all industries because of the insights provided by this data.

I think we’ll see data being used more outside of the workplace too. Everyone is wandering around with tiny computers strapped to their wrists or in their pockets, continuously collecting data. I’d love to see a complete overview of the data collected about me and see what new insights into my life it might unlock.”

Janice: “Bright! We are surrounded by data. It’s quite scary, but there are so many opportunities!”