Elizabeth Howell, May 13, 2019
Big Data, Big Challenges, Big Solutions
Big data research is growing rapidly at Carleton. Only four years after the Carleton University Institute for Data Science was founded, the initiative now includes more than 10 departments and 170 researchers working on big data-related projects. Carleton launched a collaborative Master’s in Data Science and, in 2018, the institute moved into its own space in the 5400 wing of Herzberg Laboratories. The facility includes two computer labs, two conference rooms, and a meeting room to foster collaboration. And with this growth come great discoveries.
Helping premature babies
James Green (Associate Professor, Department of Systems and Computer Engineering) was inspired by his own son’s experience in the neonatal intensive care unit (NICU) to launch a collaboration with the Children’s Hospital of Eastern Ontario (CHEO). His team is collecting four streams of data from 35 neonatal beds at the facility for periods of four to six hours per patient.
“We’re hoping to get respiration rates, patient movements and patient interventions – whether it’s for routine care, like changing a diaper, or clinical interventions. This all provides contextual information for discerning true vs. false alarms,” he said.
The four data streams include:
- Contact pressure on the mattress, from which breathing rate can be estimated.
- Colour, depth, and near-infrared video from above the patient. The depth component makes it easier to distinguish between different objects, while the colour component captures subtle changes in the baby’s skin colour from which one can estimate heart rate (the sketch after this list illustrates the rate-estimation idea).
- A bedside monitor, which extracts information from multiple sensors – blood pressure, pulse rate, breathing rate and the like.
- A custom tablet app that permits researchers to capture each event that transpires at the bedside in real time. These data will later be used as gold-standard event annotations for training machine learning systems to recognize different patient events. For example, a sneeze is not worrying – but a seizure is.
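The article does not describe the team’s actual algorithms, but the basic signal-processing idea behind estimating a breathing or heart rate from a periodic sensor stream can be sketched in a few lines. Everything below is illustrative – the function name, frequency bands, and synthetic data are assumptions, not CHEO’s pipeline:

```python
import numpy as np

def dominant_rate_per_minute(signal, fs, band):
    """Estimate the dominant periodic rate in an evenly sampled 1-D signal.

    signal: samples such as mean mattress pressure, or the mean green-channel
            intensity of a patch of skin tracked in the overhead video
    fs:     sampling rate in Hz
    band:   (low, high) plausible range in Hz – very roughly (0.5, 1.5) for
            neonatal breathing or (1.5, 4.0) for neonatal heart rate
    """
    x = np.asarray(signal) - np.mean(signal)        # remove the DC offset
    spectrum = np.abs(np.fft.rfft(x))               # magnitude spectrum
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    in_band = (freqs >= band[0]) & (freqs <= band[1])
    peak_hz = freqs[in_band][np.argmax(spectrum[in_band])]
    return peak_hz * 60.0                           # Hz -> events per minute

# Synthetic check: a noisy 40-breaths-per-minute pressure trace at 10 Hz
fs = 10.0
t = np.arange(0, 60, 1 / fs)
trace = np.sin(2 * np.pi * (40 / 60) * t) + 0.1 * np.random.randn(t.size)
print(dominant_rate_per_minute(trace, fs, (0.5, 1.5)))  # close to 40
```

The harder parts of the real problem – tracking the baby in video, handling occlusions during caregiving, and fusing the four streams – are exactly where the gold-standard annotations from the tablet app come in.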
The project is jointly funded by IBM and the Natural Sciences and Engineering Research Council (NSERC) for $102,000 annually, from 2016 to 2019. Green’s team has applied to IBM to continue this project for another three years, with more emphasis on semi-automation and machine vision.
Building up proteins
Frank Dehne (Chancellor’s Professor of Computer Science and the Director of the Carleton Institute for Data Science) uses big data to examine proteins, the building blocks of living cells. “We all have our DNA, our genes,” he explains. “Every gene has a code for making a protein, and the proteins are the ones doing things in cells.”
Proteins are biological chains (called polypeptides) built from smaller molecules called amino acids. The 20 amino acids in the human body can be arranged in trillions of ways – a chain of just 10 amino acids can already be ordered in 20¹⁰, or about 10 trillion, different sequences – enabling functions such as defending against disease or regulating cell activity. It’s these complex arrangements that Dehne’s team is trying to model.
“The problem there is if you need to take two proteins and find out whether they interact, that takes a few days of experimentation in the lab. We humans have 20,000 proteins, so that’s 200 million combinations – you can see the problem,” he said. “What we’ve been doing instead is trying to build a system that mines all the data available and predicts interactions. We’ve done a lot of projects with medical researchers – people working on treating HIV or Hepatitis B.”
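The arithmetic behind that quote is easy to verify, and a quick back-of-envelope check shows why prediction beats exhaustive lab work. A minimal sketch (the three-day figure per experiment is an assumption drawn from Dehne’s “a few days”):

```python
import math

proteins = 20_000
pairs = math.comb(proteins, 2)     # unordered pairs of distinct proteins
print(f"{pairs:,} pairs")          # 199,990,000 – roughly 200 million

# At "a few days of experimentation" per pair (say three), exhaustive
# lab testing would take on the order of a million lab-years:
print(f"{pairs * 3 / 365:,.0f} lab-years")   # about 1.6 million
```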
Dehne’s group is also working to create proteins that improve how drugs are delivered. If medical researchers can design a protein with the right instructions and put it into drug form, the protein-drug can attach itself directly to malfunctioning proteins in the body. This has applications for institutions such as the Ottawa Hospital, which is exploring new ways of treating muscular dystrophy. Dehne also receives funding from IBM for data analytics tools.
Tracking birds and land cover
Lenore Fahrig (Fellow of the Royal Society of Canada and co-director of the Geomatics and Landscape Ecology Research Laboratory) is using big data to better understand the natural world. Many of her studies focus on ecosystems in eastern Ontario, but some draw on data collected over much larger areas – even whole continents.
Her team’s most frequently used big data are “land cover” data, which map the patterns of natural and human-made environments over large areas. From these data they extract information such as the density of roads in an urban area or the pattern of agricultural areas. These data often come from remote sensing by satellites.
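To make “extracting information from land cover data” concrete, here is a minimal sketch of the kind of summary involved. The raster, class codes, and cell size are all made up for illustration; real land cover products would be read with a GIS library such as rasterio:

```python
import numpy as np

# Hypothetical class codes for a categorical land-cover raster, where each
# cell holds the dominant cover type derived from satellite imagery.
FOREST, CROP, URBAN, ROAD, WATER = range(5)

rng = np.random.default_rng(seed=1)
raster = rng.choice([FOREST, CROP, URBAN, ROAD, WATER],
                    size=(1_000, 1_000),                 # 1,000 x 1,000 cells
                    p=[0.40, 0.30, 0.15, 0.05, 0.10])    # synthetic mix

cell_km2 = (30 / 1_000) ** 2       # assume 30 m cells, as in Landsat products
total_km2 = raster.size * cell_km2

# Per-class cover fractions: the basic summary behind figures like
# "density of roads" or "pattern of agricultural areas"
for name, code in [("forest", FOREST), ("crop", CROP), ("urban", URBAN),
                   ("road", ROAD), ("water", WATER)]:
    share = np.mean(raster == code)
    print(f"{name:>6}: {share:6.1%} of {total_km2:,.0f} km^2")
```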
Fahrig’s use of big data takes her in many different directions. One growing area is the use of “citizen scientist” databases on birds. A postdoctoral researcher in Fahrig’s lab who was interested in bird movement combined citizen-science bird observations with banding data from professionals and land cover data from satellites.
Another of her team’s projects examines the spatial pattern of agricultural regions, including the sizes of crop fields and the diversity of crop types. Through collaborations in eastern Ontario and Europe, Fahrig examined how agricultural land patterns affect biodiversity. “That’s a huge data collection over several years,” she said. “For that project, we included funding for building a database to allow common formatting for all the data to be entered.”
Common funders for her work include the Natural Sciences and Engineering Research Council (NSERC) and Environment and Climate Change Canada.
Some other Carleton big data researchers include:
- Shikharesh Majumdar (Professor of Systems and Computer Engineering and the Director of the Real Time and Distributed Systems Research Centre), who is examining different techniques and platforms for processing big data on clouds and clusters.
- Tracey Lauriault (Professor of Journalism and Communication), whose expertise covers topics such as open data, crowdsourcing, data infrastructure, and data preservation and archiving. She is also a research associate with the Programmable City project at the National Institute for Regional and Spatial Analysis at Maynooth University, Ireland, which is funded by the European Research Council.
- Stephan Gruber (Canada Research Chair in Climate Change Impacts/Adaptation in Northern Canada) uses big data to understand how human activity and climate change affect natural systems and geohazards, especially permafrost thaw.
Carleton’s Institute for Data Science hosts an annual Data Day conference to showcase ongoing developments in data science at Carleton University and across the country. This year’s event included presentations, panel discussions and a speed networking session.