Interactive Visual Interfaces

Project 1 Documentation:

Application Motivation:

This application takes a CSV file full of exoplanet-related data and turns it into a usable format. More specifically, it shows how data compares among the listed planets (like the number of planets and stars in each system), the relationships between sets of data (like planet radius versus mass), and, in theory, how the views respond when one subset of data is selected rather than the entire file. Without this application, a user would have to sort through tens of thousands of pieces of information. With it, the data is far more manageable, visually appealing, and comprehensible.

Application Data:

The data used for this project can be found in this file and is sourced from this site. It's a tabular data set and contains 5243 rows of the following data attributes and their meanings:

pl_name: Planet Name
hostname: Host Name
sys_name: System Name
sy_snum: Number of Stars
sy_pnum: Number of Planets
discoverymethod: Discovery Method
disc_year: Discovery Year
pl_orbper: Orbital Period (days)
pl_orbsmax: Orbit Semi-Major Axis (AU)
pl_rade: Planet Radius (Earth-Radii)
pl_bmasse: Planet Mass (Earth-Masses)
pl_orbeccen: Orbital Eccentricity
st_spectype: Spectral Type
st_rad: Stellar Radius
st_mass: Stellar Mass
sy_dist: Distance (pc)

Among these, numerous cells had no data provided; in these cases, there was either a blank cell (exoplanets-csv) or a 'BLANK' string (exoplanets-with-blanks), which had to be sorted out before the values could be used in the application.
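
A minimal sketch of how both kinds of missing value could be normalized before charting. This is not the project's own code (the application itself is written with D3.js); the sample rows, the function names, and the choice of None as the "no data" value are assumptions made for illustration.

```python
import csv
import io

# The two "no data" placeholders described above.
MISSING_MARKERS = {"", "BLANK"}


def parse_number(cell):
    """Return the cell as a float, or None when it holds no data."""
    text = cell.strip()
    if text in MISSING_MARKERS:
        return None
    return float(text)


def load_rows(csv_text, numeric_columns):
    """Parse CSV text and normalize the listed numeric columns."""
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        for column in numeric_columns:
            record[column] = parse_number(record[column])
        rows.append(record)
    return rows


# Made-up sample rows; only the column names come from the attribute list.
sample = "pl_name,pl_rade,pl_bmasse\nPlanet A,1.5,BLANK\nPlanet B,,300.0\n"
rows = load_rows(sample, ["pl_rade", "pl_bmasse"])

# Keep only rows that can be drawn on a radius-versus-mass scatterplot.
plottable = [
    r for r in rows
    if r["pl_rade"] is not None and r["pl_bmasse"] is not None
]
```

Treating both placeholders the same way means each chart only has to check for one "missing" value when deciding which rows it can draw.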

Visualization Components:

Below is the application as initially loaded. Though the charts are separated here, they're all accessible upon loading by scrolling up or down.

The view is not yet updateable from within the application itself (i.e., clicking a bar on a barchart does not yet adjust the rest of the data to that subset). If it were, the updated view would show updated bars on all barcharts, as well as an updated linechart, histogram, scatterplot, and table corresponding to the newly selected subset. Some portions can be adjusted from within the code, though.

As seen above, the histogram and scatterplot in particular can be manipulated further. The histogram can be adjusted to show a different number of bars (or bins) where it is initialized in main.js. In the above examples, the bins value has been changed from 10 to 30. In the scatterplot, the adjustability is primarily in its zoom capabilities. With over 5000 circles on one plot, it's cramped, even with a logarithmic scale. To accommodate this, users can zoom in up to a 1:1000 scale, and as they do so the circles transition to smaller radii in order to continue fitting on-screen. Finally, as seen in the above two examples, but also applying to all of the visualizations, there are tooltips that provide further information about the element over which the user is hovering. The categories contained in the tooltips differ between chart types to give the most relevant information possible.
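
To illustrate what changing the bin count does, here is a small sketch that groups values into equal-width bins. It is an assumption-laden stand-in rather than the application's code: D3's own bin generator picks "nice" thresholds, so its counts can differ from this plain equal-width version, and the function name is hypothetical.

```python
def histogram_counts(values, bin_count):
    """Count values in `bin_count` equal-width bins spanning min..max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bin_count
    counts = [0] * bin_count
    for value in values:
        if width == 0:
            index = 0  # every value is identical
        else:
            # The maximum value belongs to the last bin, not a new one.
            index = min(int((value - lo) / width), bin_count - 1)
        counts[index] += 1
    return counts


# Made-up values: the same data drawn with 5 bars and with 10 bars.
values = list(range(10))
coarse = histogram_counts(values, 5)
fine = histogram_counts(values, 10)
```

The total number of values never changes; a larger bin count only spreads them over narrower bars.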

Enabling Discovery:

Looking through all of this data and the results I ended up with was interesting, especially since I'm a huge fan of cool data-related things. Some of the most notable discoveries are the clearly defined trends in certain visualizations: for example, the spike of exoplanet discoveries in 2015, as well as the nearly 4000 exoplanets discovered via the transit discovery method. Another is the seemingly linear correlation between the radius and mass of planets, in spite of both being depicted logarithmically. Having tooltips available to show actual values was helpful in these cases to see exactly what was going on in that category, beyond just what was visible on the initial screen.
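
A note on reading that last trend: a straight line on log-log axes corresponds to a power-law relationship (mass proportional to radius raised to some exponent) rather than a linear one between the raw values, and the slope of the line is that exponent. The sketch below fits the slope by least squares on the logarithms. The data is synthetic and the exponent of 3 is invented purely so the result is easy to check; it is not a claim about the exoplanet data set.

```python
import math


def log_log_slope(xs, ys):
    """Least-squares slope of log10(y) against log10(x)."""
    log_x = [math.log10(x) for x in xs]
    log_y = [math.log10(y) for y in ys]
    mean_x = sum(log_x) / len(log_x)
    mean_y = sum(log_y) / len(log_y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    variance = sum((a - mean_x) ** 2 for a in log_x)
    return covariance / variance


# Synthetic points that follow mass = radius ** 3 exactly.
radii = [1.0, 2.0, 4.0, 8.0]
masses = [r ** 3 for r in radii]
slope = log_log_slope(radii, masses)
```

For these synthetic points the fitted slope recovers the exponent used to generate them.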

Code Process:

For this project, I didn't use outside libraries other than D3.js v6 (linked here). This was my first venture into front-end work rather than my usual back-end skillset, so I wanted to keep things simple as I got accustomed. I set up one file for each type of chart (barchart.js, histogram.js, etc.), and in each of these I created an object constructor. Each one was initialized in main.js and fed its corresponding parameters, like "number of bins" for histograms. To run everything, I used the Visual Studio Code Live Server extension. This way I could simply right-click on my HTML file and select "Open with Live Server" to avoid extraneous issues. For all of my version control, I used my favorite branch visualizer, GitGraph. All of my 40+ commits and 9 different stashes can be seen in it! The link to my most recently updated code version can be found here.

Future Works:

With unlimited time, some of the changes I'd make or features I'd add would first include a working data filter/updater when selecting barchart bars. As seen in my barchart.js file, I attempted to do so, but overall did not have the time to get it to a functional state. The second major change I would implement would be an overhaul of my layout; again, there is an attempt at this visible in my HTML and CSS files, but in the same vein as the former example, it ended up being commented out in favor of getting everything else to work properly. Having everything visible on one screen without the need for scrolling would be a major improvement, as would be an adjustment to several axis and chart titles. Without prior HTML/CSS experience, this presented some trouble, but with some time I think it could be a significant improvement.

Demo:

Project 2 Documentation:

Application Motivation:

The overall motivation for this application was to take a very large data set of information on "311" calls in the Cincinnati area and turn it into an interactive, more easily understood visualization. It breaks the data down and shows things like where calls occurred, on what days, weeks, and months they occurred, their present status, how each call was categorized, and more. Ultimately, it takes tens of thousands of lines of data and turns them into something not only useful, but also visually appealing and easy to comprehend.

Application Data:

The full data set used for this project can be found in this file and the processed version here. It's a tab-separated data set and contains the following data attributes, all of which have a reasonably self-explanatory meaning (i.e., SERVICE_REQUEST_ID simply refers to the ID associated with that service request call):

SERVICE_REQUEST_ID
STATUS
SERVICE_NAME
SERVICE_CODE
DESCRIPTION
AGENCY_RESPONSIBLE
REQUESTED_DATETIME
UPDATED_DATETIME
EXPECTED_DATETIME
ADDRESS
ZIPCODE
LATITUDE
LONGITUDE
REQUESTED_DATE
UPDATED_DATE
LAST_TABLE_UPDATE

Visualization Components:

Below is the body of the visualization upon loading. The individual components include the interactive map, a chart showing calls per day, a chart showing calls per month divided by call status, and a chart showing calls per week or day. The last of these is adjustable in several ways using the controls above it: the bin start day input and the "Show Weeks" versus "Show Days" button selectors.

The above visualization shows the main components. These include a stacked bar chart that provides a deeper look into how calls are made up. Along with this, the map is shown with an interactive radio button selection; this determines which map view format is displayed, and may be changed repeatedly. In the lower bar chart, the user can set the bin start date and the week/day option in order to view data more effectively depending on the intended usage. Beyond the initial page, users can also take a more in-depth look at the time between call requests and their updated status by clicking on "Call Update Times." As seen below, this view is interactive via a slider beneath it, which adjusts the cutoff thresholds shown on the chart.
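
The bin start date and week/day option can be thought of as one binning rule with two settings. The sketch below is a hypothetical Python version of that rule, not the TypeScript used in the project; the function names and the sample dates are made up.

```python
from collections import Counter
from datetime import date, timedelta


def bin_start(day, first_bin_start, bin_days):
    """Return the first day of the bin that `day` falls into."""
    offset = (day - first_bin_start).days
    return first_bin_start + timedelta(days=(offset // bin_days) * bin_days)


def calls_per_bin(call_dates, first_bin_start, bin_days):
    """Count calls per bin, ordered by bin start date."""
    counts = Counter(bin_start(d, first_bin_start, bin_days) for d in call_dates)
    return dict(sorted(counts.items()))


# Made-up request dates.
call_dates = [
    date(2022, 1, 3),
    date(2022, 1, 5),
    date(2022, 1, 10),
    date(2022, 1, 16),
    date(2022, 1, 17),
]

# "Show Weeks": 7-day bins starting on the chosen start day.
weekly = calls_per_bin(call_dates, date(2022, 1, 3), 7)
# "Show Days": the same rule with 1-day bins.
daily = calls_per_bin(call_dates, date(2022, 1, 3), 1)
```

Moving the start day shifts every weekly bin boundary, which is why the same calls can produce differently shaped bars.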

Design Sketches & Justification:

Below is the initial, more general design sketch of what we wanted our overall visualization to look like. It has the overarching ideas shown - the largest two portions are designated into their own spots on the right side section, and a designated space is set for the additional charts required for the A and B Goals, respectively.

After setting up a "skeleton" for our overall visualization, we furthered the sketch and added in specifics, like which charts we'd be creating for the A and B Goals, and where each one would appear within the designated space. This is shown below.

As seen in this sketch, there are sections within the larger area on the left side in which space is reserved for the Header, vertical barchart, pie chart, horizontal bar chart, and additional visualization. Though we ended up tweaking this slightly to fit the progression of our code, the sketch provided the overall structure for our visualization. Our reasoning behind this setup was to have our heaviest, largest viz at the top right corner, naturally drawing the eye. Around this main focal point lay the smaller charts, each of which offered additional insight into the map, and eventually into its selected section of data via brushing. The size and placement of the charts within the left sidebar follow a natural progression of smaller to larger charts, allowing the eye to flow down the section without being interrupted by unexpected changes in size or shape.

Enabling Discovery:

This application allowed us to see exactly where calls occurred, what type of calls they were, and when they typically happened. Some of the more surprising discoveries are visible in the Calls per Day chart; over the weekend, calls dropped noticeably compared to weekdays. This was particularly surprising since it would seem that when people have free time and are off of the typical workday/workweek, there would be more calls. However, the data clearly contradicted this. Another interesting discovery was in the additional view accessible via the Call Update Times button. An overwhelming portion of calls were updated within days, if not sooner. With thousands of calls being placed, this too was notable.
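
A sketch of the calculation behind the Call Update Times view: the time between REQUESTED_DATETIME and UPDATED_DATETIME for each call, and the share of calls at or under a cutoff. The ISO-style timestamp format, the sample calls, and the function names are assumptions; the real data set's format may differ.

```python
from datetime import datetime

SECONDS_PER_DAY = 86400


def days_to_update(requested, updated):
    """Days elapsed between a call's request and its latest update."""
    delta = datetime.fromisoformat(updated) - datetime.fromisoformat(requested)
    return delta.total_seconds() / SECONDS_PER_DAY


def share_updated_within(calls, cutoff_days):
    """Fraction of calls whose update came within `cutoff_days`."""
    durations = [
        days_to_update(c["REQUESTED_DATETIME"], c["UPDATED_DATETIME"])
        for c in calls
    ]
    within = sum(1 for d in durations if d <= cutoff_days)
    return within / len(durations)


# Made-up calls updated after 0.5, 2, 20, and 1 days respectively.
calls = [
    {"REQUESTED_DATETIME": "2022-01-03T08:00:00", "UPDATED_DATETIME": "2022-01-03T20:00:00"},
    {"REQUESTED_DATETIME": "2022-01-04T09:00:00", "UPDATED_DATETIME": "2022-01-06T09:00:00"},
    {"REQUESTED_DATETIME": "2022-01-05T10:00:00", "UPDATED_DATETIME": "2022-01-25T10:00:00"},
    {"REQUESTED_DATETIME": "2022-01-06T11:00:00", "UPDATED_DATETIME": "2022-01-07T11:00:00"},
]

within_a_day = share_updated_within(calls, 1)
within_a_week = share_updated_within(calls, 7)
```

Raising the cutoff, as the slider does, can only keep the share the same or increase it.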

Process:

For this project, we primarily used D3.js v6, with the additional help of TypeScript to make things clearer and LeafletJS to streamline our map creation. Since we were now working as a group of four rather than alone, having some guidance on "what's what" was an important consideration and, ultimately, a worthwhile addition.

The structure of our code partially relied on TypeScript. We created our chart frameworks in individual files (Barchart.ts, Table.ts, LeafletMap.ts, etc.), and each one mapped to a corresponding d.ts, d.ts.map, js, and js.map file. These were automatically regenerated from the code in the actual ts file, ultimately producing what appears onscreen.

To run the application, one can either run the startserver.sh file, or - as I personally did - utilize the VSCode Live Server extension, which enables users to simply run it straight off of the index.html file in the folder. The link to the overall code can be found here.

Demo:

Team Dynamic:

For this project, Scott took on the setup of the TypeScript functionality, as well as the map and histogram. Manish helped create the left sidebar's components and the charts within each one. Anthony helped with our B Goals and also fixed tooltips when we encountered an issue with them. I had some difficulty between sickness and travel in helping as much as I would have preferred, but I provided the overall project and skeleton of the setup, the design choices associated with the color selection and modification on the map, and the barchart interactivity.

"Data Visualization in the Wild" Design Critique:

Visualization Source:

For the link to my Visualization Source, click here. For the full data download (be prepared - it is relatively large), click here.

Critical Discussion:

If you prefer to view the original Powerpoint, feel free to click here. If not, my full critique is shown below:

Data Viz Critique (slideshow, 9 slides)

Project 3 Documentation:

Application Motivation:

This application's intent was to find the scripts of a TV show containing 10+ recurring characters over the course of several seasons, and turn the information contained within those scripts into a more easily digestible format. Some of the specific tasks to be achieved included viewing the most commonly spoken words by character and seeing how often each character spoke in total or by season. This essentially turns hours' worth of watch-time into something simple and able to be understood even by those unfamiliar with the show's contents.

Application Data:

This project required us to find and choose a show with either pre-made data sets, or parsable/scrapeable scripts which name the character speaking at that moment. I chose The Last Kingdom and sourced my scripts through this website. Cleaning and prepping the scripts themselves for usage was a challenge; I'd never used Python for anything beyond basics, so even figuring out how to collect the individual links to scripts from the main forum page was something new.

Above is my original function used to collect said links, as well as scrape the text from each one and put it into a text file, 'data.txt'. To achieve this I learned about Pandas and BeautifulSoup, two tools that were new to me. After that was completed I had to clean my data, which primarily meant identifying which bracket sets should be preserved, like '[Edward]', and which could be removed, like '[waves crashing]'. An additional function to remove instances of sound effect brackets took care of this, along with some minor manual removals for unusual instances of bracket usage. Once the file was cleaned and ready to split into a csv file for more efficient usage, I used Pandas' DataFrame function to collect each speaker instance, the corresponding dialogue, and the season and episode in which it occurred; this was all placed into a csv file with matching columns.
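
The bracket-cleaning step could look something like the sketch below. This is not the original function, which isn't reproduced here; it assumes a known set of speaker names is available, and the sample line of script is invented.

```python
import re

# Matches one bracketed span such as "[Edward]" or "[waves crashing]".
BRACKET = re.compile(r"\[([^\]]*)\]")


def strip_sound_effects(text, speakers):
    """Drop bracketed spans unless they name a known speaker."""

    def keep_or_drop(match):
        if match.group(1).strip() in speakers:
            return match.group(0)  # speaker tag: preserve as-is
        return ""  # sound effect or stage direction: remove

    cleaned = BRACKET.sub(keep_or_drop, text)
    # Collapse the gaps left behind by removed brackets.
    return re.sub(r"[ \t]{2,}", " ", cleaned).strip()


# Invented sample line; only the two bracket examples come from the text.
raw = "[waves crashing] [Edward] We sail at dawn. [door slams]"
cleaned = strip_sound_effects(raw, speakers={"Edward", "Uhtred"})
```

Anything the speaker set doesn't recognize is dropped, so unusual bracket usage would still need the kind of manual review described above.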

Above is an example of the first 2 of 895 instances of this.
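
For readers without the image, the shape of that csv can be sketched as follows. The column names (speaker, dialogue, season, episode) are assumed from the description above, the second sample line is invented, and the standard csv module is swapped in for Pandas so the example has no dependencies.

```python
import csv
import io
import re

# A cleaned script line: "[Speaker] dialogue".
SPEAKER_LINE = re.compile(r"^\[([^\]]+)\]\s*(.+)$")
COLUMNS = ["speaker", "dialogue", "season", "episode"]


def parse_lines(lines, season, episode):
    """Turn cleaned script lines into one row per speaker instance."""
    rows = []
    for line in lines:
        match = SPEAKER_LINE.match(line.strip())
        if match:  # skip blank lines and lines without a speaker tag
            rows.append({
                "speaker": match.group(1),
                "dialogue": match.group(2),
                "season": season,
                "episode": episode,
            })
    return rows


def to_csv(rows):
    """Serialize rows to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


lines = ["[Uhtred] I am Uhtred, son of Uhtred.", "", "[Edward] We sail at dawn."]
rows = parse_lines(lines, season=1, episode=1)
csv_text = to_csv(rows)
```

Dialogue containing commas is quoted automatically by the csv writer, so it survives a round trip back into rows.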

Visualization Components:

Below is my original plan for my visualization, as well as the components which I ended up completing - a barchart, chord diagram, and word cloud.

In addition to these three components, I also included a dropdown selection for Seasons 1 through 5, as well as an option to reset and include the show in its entirety. Though this didn't end up working in this timeframe, its intent was to update each component and filter the data shown to include only that season's portion. The barchart shows how many instances of dialogue each of the top named characters spoke; the word cloud shows the overall most commonly spoken words (words like "i" and "the" have been filtered out); and the chord diagram shows how commonly each character references another. The color scheme shown corresponds to the ethnicity of the character named: peach is for Saxons, green is for the singular Irish character in the entire show, teal is for Mercians, blue is for Danes, and purple is for those of both Danish and Saxon heritage. This color scheme would eventually have been applied to the barchart as well.
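
The counts behind the word cloud and chord diagram can be sketched as below. This is an illustrative Python version, not the project's D3 code: the stop-word list is a tiny made-up subset, the row layout (speaker, dialogue) is assumed, and two of the three sample lines are invented.

```python
import re
from collections import Counter

# Illustrative subset only; a real stop-word list would be much longer.
STOP_WORDS = {"i", "the", "a", "and", "to", "of", "you", "is", "am", "will"}


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z']+", text.lower())


def top_words(rows, limit):
    """Most common words across all dialogue, stop words removed."""
    counts = Counter(
        word
        for row in rows
        for word in tokenize(row["dialogue"])
        if word not in STOP_WORDS
    )
    return counts.most_common(limit)


def reference_counts(rows, characters):
    """Count (speaker, named character) pairs for a chord diagram."""
    names = {name.lower(): name for name in characters}
    counts = Counter()
    for row in rows:
        for word in tokenize(row["dialogue"]):
            if word in names:
                counts[(row["speaker"], names[word])] += 1
    return counts


rows = [
    {"speaker": "Uhtred", "dialogue": "I am Uhtred, son of Uhtred."},
    {"speaker": "Edward", "dialogue": "Uhtred, you will ride north."},
    {"speaker": "Uhtred", "dialogue": "Edward is the king."},
]

most_common = top_words(rows, 1)
references = reference_counts(rows, ["Uhtred", "Edward"])
```

Because a speaker can name themselves, pairs like (Uhtred, Uhtred) are counted too, and these would appear as self-referencing chords.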

Enabling Discovery:

The main discovery from this project is how much the show objectively centers around the main character, Uhtred Ragnarsson/Uhtred of Bebbanburg. Having watched the show, I know that a major plotline is that the formation of England doesn't include all those who helped make it happen; Uhtred is the character who represents those unseen figures from the real-life events depicted in-show. However, I didn't anticipate just how much he spoke. Approximately a third of his character references were to himself (perhaps because of his catchphrase "I am Uhtred, son of Uhtred" spoken at least twice per episode?), and "Uhtred" is the second most-spoken word of the show after filtering out mundanities. Tooltips better exemplify this, particularly in the chord diagram.

Code Process:

For this project, I used D3.js v6 with the help of Pandas and BeautifulSoup to collect and organize my scripts into usable data. With further work I would have included an additional plug-in for the Sankey diagram. To run the application, I used the same process as my Project 1, by utilizing VSCode's Live Server extension. The overall code can be found here.

Demo: