Purposes, concepts, misfits, and a redesign of Git

Purposes, concepts, misfits, and a redesign of Git – De Rosso & Jackson, OOPSLA ’16

It’s OOPSLA in just a few weeks’ time, and ‘Purposes, concepts, misfits, and a redesign of Git’ will be presented there, bringing the story we looked at yesterday up to date. The subject matter is the same of course – looking at the extent to which Git’s conceptual model causes confusion in its interface – but the thinking framework around concepts and concept modelling has matured, and the coverage of Git’s model is more comprehensive. There is more validation too: an analysis of 2400 Stack Overflow questions relating to Git (as of July 18th 2016), and a user study with existing Git users.

I have mixed feelings about that study. The tasks that the users were asked to do, which are documented in Appendix A, are really quite interesting in the way they reveal some of Git’s corner cases (worth a read). But the total user study involved only 11 people, and they each interacted with the tools for one hour. 11 user tests lasting one hour each seems a very small sample indeed for a project that has been running for 3 years now. That’s enough to draw some qualitative conclusions on the version of the tool used at the time, but not enough to produce any quantitative results (see ‘How many test users in a usability study?’). I think we can say at this point that the analysis of the difficulties with Git and how they tie back to concepts seems sound. Yet the question we really want answered – does a redesign based around a simpler, cleaner conceptual model produce a tool that is substantially easier to use? – remains open. We have an intuition that it does, and some early results to back that up, but nothing of statistical significance.

The authors found a significant polarisation of reactions to their work. Which group are you in?

In sharing our research with colleagues… we have discovered a significant polarization. Experts, who are deeply familiar with the product, have learned its many intricacies, developed complex, customized workflows, and regularly exploit its most elaborate features, are often defensive and resistant to the suggestion that the design has flaws. In contrast, less intensive users, who have given up on understanding the product, and rely on only a handful of memorised commands, are so frustrated by their experience that an analysis like ours seems to them belabouring the obvious.

A Conceptual design framework

Conceptual design is concerned with the selection and shaping of the essential concepts in a system. This is in contrast with representation design, which concerns how to embody those concepts in the code. The authors’ theory of conceptual design has evolved to include a study of motivating purposes and operational principles.

A concept is something you need to understand in order to use an application… and is invented to solve a particular problem which is called the motivating purpose.  A concept is defined by an operational principle, which is a scenario that illustrates how the concept fulfills its motivating purpose. The operational principle is thus a very partial description of the behavior associated with the concept, but focused on the particular aspect of behavior that motivated the introduction of the concept in the first place.

My interpretation: there are a number of goals/tasks the end user has, and the purpose of the system is to support them in achieving those goals. If a concept is necessary to fulfil a given purpose, then that purpose is its motivating purpose – the reason we need the concept. Each concept should be accompanied by a scenario that illustrates how that concept is necessary to fulfil its motivating purpose.

For example, Mac OS has the concept of a trash can. The motivating purpose of the trash can is enabling users to recover deleted files. The operational principle is that any time a file is deleted it is not permanently removed, but instead placed in a special trash folder from which it can be restored until the trash is emptied.
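The trash can's operational principle is small enough to write down as a scenario in code. What follows is only a toy sketch: the `TrashCan` class and its method names are hypothetical, invented here for illustration, and have nothing to do with how any real operating system implements its trash.

```python
class TrashCan:
    """Toy model of the trash concept: deletion is reversible until emptied."""

    def __init__(self, files):
        self.files = dict(files)  # live files: name -> contents
        self.trash = {}           # deleted, but still recoverable

    def delete(self, name):
        # Deleting moves the file into the trash instead of destroying it.
        self.trash[name] = self.files.pop(name)

    def restore(self, name):
        # This is the motivating purpose: recovering a deleted file.
        self.files[name] = self.trash.pop(name)

    def empty(self):
        # Only at this point are the deleted files permanently gone.
        self.trash.clear()


# The operational principle as a scenario: delete, then restore.
fs = TrashCan({"notes.txt": "draft"})
fs.delete("notes.txt")
fs.restore("notes.txt")
assert fs.files == {"notes.txt": "draft"}
```

The closing scenario is the interesting part: it describes only the behaviour that motivated the concept, which is what the authors mean by a "very partial description".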

A concept may not be entirely fit for purpose. In that case, one or more operational misfits are used to explain why. The operational misfit usually does not contradict the operational principle, but presents a different scenario in which the prescribed behavior does not meet a desired goal.

In other words, operational misfits serve to highlight weaknesses in the current conceptual design. You often don’t find them until later in the process. “A good design process, however, should aim to identify misfits as early as possible.”

How do you find misfits early? The authors recommend analysing a design along five dimensions:

  • Concept Motivation – each concept should be motivated by at least one purpose. Prune concepts which only arise from implementation concerns.
  • Concept Coherence – each concept should be motivated by at most one purpose. Concepts motivated by multiple purposes struggle to fulfil any one purpose well.
  • Purpose Fulfilment – each purpose should motivate at least one concept. If not, a purpose representing a real need will not be fulfilled.
  • Non-division of purposes – each purpose should motivate at most one concept. When the same purpose motivates different concepts in different contexts, confusion tends to arise.
  • Decoupling – concepts should not interfere with one another’s fulfilment of purpose.

Which seems to boil down to: there is a one-to-one mapping between concepts and purposes, and concepts should be orthogonal. I’d like to see a few worked examples to explore e.g. the rule that a purpose cannot motivate more than one concept.
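Under that one-to-one reading, the first four dimensions can be checked mechanically once you have written down which purposes motivate which concepts. Here is a minimal sketch of such a check. The function `check_design`, the toy file-manager purposes, and the concept names are all my own assumptions for illustration; none of them come from the paper. Decoupling is deliberately left out, since it depends on how concepts behave rather than on the mapping alone.

```python
def check_design(motivates, purposes):
    """Report violations of the first four dimensions.

    motivates maps each concept to the set of purposes that motivate it;
    purposes lists every purpose the system is meant to serve.
    """
    findings = []
    for concept, ps in motivates.items():
        if len(ps) == 0:
            findings.append(f"motivation: {concept} has no purpose")
        if len(ps) > 1:
            findings.append(f"coherence: {concept} serves {sorted(ps)}")
    for p in purposes:
        concepts = sorted(c for c, ps in motivates.items() if p in ps)
        if not concepts:
            findings.append(f"fulfilment: {p} motivates no concept")
        if len(concepts) > 1:
            findings.append(f"division: {p} motivates {concepts}")
    return findings


# Hypothetical file-manager design, invented for this example.
purposes = ["recover deleted files", "find files by name", "share files"]
motivates = {
    "trash": {"recover deleted files"},
    "search box": {"find files by name"},
    "quick find": {"find files by name"},  # second concept, same purpose
    "cache": set(),                        # implementation concern only
}
for finding in check_design(motivates, purposes):
    print(finding)
```

On the toy design this flags the unmotivated cache, the purpose shared by two search concepts, and the unfulfilled sharing purpose. Whether every such flag is really a design flaw is exactly the question a few worked examples would help answer.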

Let’s take a look at these ideas in the context of Git.

Operational Misfits in Git

We looked at some of these yesterday. See §3 for more details.

  • Saving changes in the middle of a long task without creating an incomplete commit.
  • Switching branches when you have uncommitted changes
  • Inadvertently ending up with a ‘detached head’ when trying to go back to an old commit
  • Renaming files that also contain significant other changes
  • Creating and adding a new file, working on it, and then committing it – this will commit the original version you added, not the latest
  • Removing a previously tracked file from tracking
  • Creating an empty directory (requires the creation of a token file)
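The add-then-commit misfit in the list above is easy to reproduce in a toy model. The sketch below assumes nothing about Git's real internals: `ToyRepo` and its methods are hypothetical names, and the model keeps only the one behaviour that matters here, namely that staging snapshots a file's contents and a commit records the staging area rather than the working directory.

```python
class ToyRepo:
    """Deliberately tiny model: working directory, staging area, commits."""

    def __init__(self):
        self.working = {}  # file name -> current contents on disk
        self.staged = {}   # file name -> contents captured at add time
        self.commits = []  # each commit is a dict of name -> contents

    def write(self, name, contents):
        self.working[name] = contents

    def add(self, name):
        # Staging copies the contents as they are right now.
        self.staged[name] = self.working[name]

    def commit(self):
        # A commit records the staging area, not the working directory.
        self.commits.append(dict(self.staged))


repo = ToyRepo()
repo.write("report.txt", "first draft")
repo.add("report.txt")
repo.write("report.txt", "final version")  # keep working after the add
repo.commit()

# The commit holds the version that was added, not the latest one.
assert repo.commits[-1]["report.txt"] == "first draft"
assert repo.working["report.txt"] == "final version"
```

The user's goal was to commit the file as it stands; the staging area's snapshot behaviour gets in the way without ever contradicting its own operational principle, which is what makes this a misfit rather than a bug.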

Concepts and Purposes in Git

Why do we have version control? The authors identify six top-level purposes:

  1. To make a set of changes persistent
  2. To group logically related changes
  3. To record coherent points in the development
  4. To synchronize changes among collaborators
  5. To support parallel development
  6. To work in disconnected mode

According to their own rules therefore, we should have exactly six concepts as well.

A good concept should have a compelling purpose: that purpose is what allows users to grasp the concept, and allows developers to focus on designing the concept to fulfill that purpose. In contrast, the conventional view is that the concepts of an application fulfill one or more purposes only in aggregate; there need be no rationale for the design of any single concept except that it plays some role in the larger whole. Our view is that this amorphous relationship between concepts and purposes is what has hindered the kind of design analysis we are attempting here, and that the approach of assigning purposes to concepts not only immediately highlights some discrepancies, but also provides a decomposition that makes deeper analysis possible. In particular, when concepts seem to have no direct mapping to a given purpose, their motivation is questionable, and one might wonder whether they are needed at all.

What set of concepts does Git have?

  • Repositories
  • Commits
  • Working Directory
  • Staging Area
  • Stash
  • References
  • File classifications

Which we can map to purposes as follows:

[Figure: mapping of Git’s concepts to the purposes that motivate them]

The misfits can be explained in terms of that mapping:

  • The concept of commit is connected to two purposes (P1 and P2), and this tangling causes the ‘saving changes’ misfit.
  • The working directory, staging area, and branch concepts are not decoupled; this causes the switching-branches misfit.
  • The stash concept turns out not to be motivated by any purpose.
  • The staging area and file classification concepts are coupled, leading to problems creating, adding, and removing files.
  • The difficulty in removing a previously tracked file from tracking stems from a single purpose (prevent committing of untracked files – yes, that wasn’t in their list of purposes…) motivating two concepts: ignored files, and untracked files.
  • The file rename and empty directory misfits are violations of the fulfilment requirement – they seem to point to a missing concept.

Gitless revisited

As we saw yesterday, Gitless has a smaller number of concepts. I really wish the authors had provided the new purpose -> concept mapping for Gitless, but as it is we’re going to have to try to recover that for ourselves. The purposes remain unchanged of course. The concepts in Gitless are:

  • Repository
  • Commit
  • Working Directory
  • Branch
  • File Classification

The stash and staging area are gone, there is a reduced number of file classifications, the concept of branch is redefined, and the notion of a ‘current branch’ is introduced. See yesterday’s write-up and the Gitless website for more details.

The ‘file tracking’ and ‘untracking a file’ misfits are prevented by the elimination of the ‘assume unchanged’ classification and the staging area. The switching branches misfit is prevented by the redefinition of the branch concept and the elimination of stashes. The current branch notion and the redefinition of head prevent the detached head misfit.

How do we complete this diagram though?

To be honest, I’m not sure. And yet this would seem to be crucial to the authors’ methodology.