The Problem with Data Modeling (as commonly practiced)

Every mature MVC application has that one model that’s grown out of control. The table has 20 columns; there are user preferences stored in it alongside system information; it sends email notifications and writes other models to the database. The model encompasses so much application logic that any new feature is likely to have to go through it. It’s the class that makes developers groan whenever they open it up.
How did we get here?
Web applications usually start out with a single purpose, to display a single type of data – maybe it’s Article for an online journal or ClothingItem for a fashion retailer. It’s common for MVC practitioners to take concepts from their product whiteboarding sessions and directly translate them into database models. So we start out with a database model that represents a central concept in the application, and as more business requirements emerge, the cheapest way to accommodate them is to add columns to the existing model. Carry this out over time and you end up with a God Object.
The problem from the start was taking the casual, loose concepts sketched out at the product level and putting them in the codebase. Software engineers should be well aware that the concepts people use in everyday life and thinking are terribly imprecise and loaded with implicit assumptions. Highlighting implicit assumptions is often a software engineer’s key contribution, so it’s a wonder we take these concepts from the product level and embed them in our code. Doing so invites hidden edge cases, each of which needs clarifying logic that ends up shoved into the existing class.
The concept of Folk Psychology is illuminating here. Folk Psychology refers to the innate, loosely specified theories people hold about how other humans operate, which they use to infer motivations and predict behavior. These “folk” theories work well enough in the context of everyday human life, but they are not scientifically rigorous and contain blind spots. Similarly, people make use of “folk object models” in software businesses. These are the informal concepts people construct to discuss software with other humans: the words product managers use with software engineers, the boxes drawn on the whiteboard. They work well enough when discussing concepts with other humans, who can be generous in their interpretations, but they can fall apart when formalized as code. These concepts are a useful starting point for framing product features, but from an OO perspective they are too broad to be used as classes. They tend to accumulate logic because they implicitly encompass so much of the problem domain.
Much as the first obstacle people have to overcome when learning to code is to take their thoughts and explicitly formulate them as steps in an algorithm, experienced software engineers need to take folk object models and break them down into explicit components that can be used as classes. In the product domain, we may start with a broad “User” concept, but as we dig deeper we’ll discover different pieces of logic that would be better served as separate classes: a billing preference, a current status, or notification settings. Each of those will require its own logic to meet product requirements, and if we don’t separate them out to make space for that logic, we’ll incur bloat.
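To make the “User” example concrete, here is a minimal sketch in Python of what that decomposition might look like. The class names, fields, and methods (BillingPreference, NotificationSettings, AccountStatus, can_receive, and so on) are hypothetical and chosen only for illustration; they are not taken from any particular codebase or framework, and a real application would map these pieces onto its own ORM models.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Set


class AccountStatus(Enum):
    """Hypothetical lifecycle states for an account."""
    ACTIVE = "active"
    SUSPENDED = "suspended"


@dataclass
class BillingPreference:
    """Billing logic lives here instead of on User (illustrative fields)."""
    plan: str = "free"
    invoice_by_email: bool = True

    def is_paid(self) -> bool:
        return self.plan != "free"


@dataclass
class NotificationSettings:
    """Notification rules get their own home (illustrative fields)."""
    email_enabled: bool = True
    muted_topics: Set[str] = field(default_factory=set)

    def should_notify(self, topic: str) -> bool:
        return self.email_enabled and topic not in self.muted_topics


@dataclass
class User:
    """A thin User that delegates to focused components."""
    email: str
    status: AccountStatus = AccountStatus.ACTIVE
    billing: BillingPreference = field(default_factory=BillingPreference)
    notifications: NotificationSettings = field(default_factory=NotificationSettings)

    def can_receive(self, topic: str) -> bool:
        # User only coordinates; the rules themselves live in the components.
        return (
            self.status is AccountStatus.ACTIVE
            and self.notifications.should_notify(topic)
        )
```

With this shape, a new requirement such as per-topic muting lands in NotificationSettings rather than adding another column and another branch to User.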
People often think that data modeling is about encoding the business concepts in software, but really it’s about using model classes as tools to construct a system. Often codebases are better served when large models are broken into components that each address a specific piece of domain logic.

Adam McKenzie

As CTO, Adam is responsible for managing the HPC and customer success teams. Adam began his career at Boeing, where he spent seven years working on the 787, managing structural and software engineering projects designing, analyzing, and optimizing the wing. Adam holds a B.S. in Mechanical Engineering cum laude from Oregon State University.


