PROJECT
Willow
As part of my second year of university, I worked on a Systems Engineering group project. The group worked with Seldon, a London-based machine learning startup, to create a web tool that can be used to pre-process data for use with machine learning.
Background
Raw data collected from the real world is often messy, incoherent, and inconsistent. In most cases, the data has to be pre-processed to remove inconsistencies and extract key features before it can be used for machine learning. This process is often done manually using command-line tools and can take up a lot of time before any real work can be done on the data.
Project Aim
Create a web tool with a graphical user interface that can:
- Clean and transform datasets
- Analyze datasets and give useful statistics
- Visualize datasets to help spot patterns
Roles & Responsibilities
- Deputy Group Manager: Assisted the group manager with allocating work and communicating with the client.
- UI Lead: In charge of designing and implementing the user interface of the system, making sure that it is easy to use and aesthetically pleasing.
- Chief Editor: Responsible for writing and editing documentation such as the user guide and the project website.
Project Outcome
The group worked on the project for 6 months and was wholly responsible for planning, coding, and communicating with the client. The final product was a web app that allowed users to upload their datasets in CSV and Excel formats, and provided cleaning options such as filling in missing data, feature encoding, and discretization. The application also provided a range of visualization and analytical features to help users find patterns within the data.
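To give a sense of what these cleaning options involve, here is a minimal sketch in Pandas. It is not Willow's actual code: the `clean_dataset` helper, the column names, and the choice of median/mode filling and equal-width bins are all assumptions made for illustration.

```python
import pandas as pd


def clean_dataset(df: pd.DataFrame, numeric_col: str,
                  categorical_col: str, bins: int = 3) -> pd.DataFrame:
    """Hypothetical helper: fill missing values, discretize, and encode."""
    out = df.copy()  # leave the uploaded data untouched

    # Filling in missing data: median for numbers, most common value for categories
    out[numeric_col] = out[numeric_col].fillna(out[numeric_col].median())
    most_common = out[categorical_col].mode().iloc[0]
    out[categorical_col] = out[categorical_col].fillna(most_common)

    # Discretization: split the numeric column into equal-width bins (0, 1, 2, ...)
    out[numeric_col + "_bin"] = pd.cut(out[numeric_col], bins=bins, labels=False)

    # Feature encoding: one-hot encode the categorical column
    return pd.get_dummies(out, columns=[categorical_col])


# Illustrative data only
raw = pd.DataFrame({
    "age": [22.0, None, 35.0, 58.0],
    "city": ["London", "Leeds", None, "London"],
})
cleaned = clean_dataset(raw, "age", "city")
print(cleaned)
```

Each step maps to one option a user could pick in the interface. Working on a copy means the original upload stays available if the user wants to undo a transformation.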
Our client, Seldon, was extremely impressed with the product, especially praising its user interface, and the product was featured in one of their blog posts. This project also achieved the highest mark out of the 40 or so projects in the same module.
The Project Website
One of my responsibilities as Chief Editor was to create a project website that documented our progress and provided support for users of Willow and other developers.
Design Concepts
The website was split into two sections, one to document the progress of the project, the other to provide support and documentation. In order to create a clear distinction between the two sections intended for different audiences, different colour schemes were used. The layout, on the other hand, was kept similar to create a coherent appearance throughout the website.
The "cubes" symbol was used throughout the website and other documents related to the project. The cubes represent the vast amount of data collected for use in machine learning: while the data is no doubt valuable, it needs to be organized in a particular way before it is actually useful. This metaphor reinforced the aim of the project and created a coherent theme across the website, the product, and other related materials.
Things I Learned
- Developing web applications using HTML, CSS, and the AngularJS framework
- Working with Python modules including Pandas and scikit-learn
- Maintaining consistent code quality and collaborating with other group members
- Working in a professional manner with other group members as well as the client
- Developing a static website using HTML, CSS, and the Jekyll generator
- Building a responsive website
- Optimizing images and other assets to improve load time