Statistical physics of the Schelling Model of Segregation


Overview

In 1971, the American economist Thomas Schelling created an agent-based model that might help explain why segregation is so difficult to combat. His model of segregation showed that even when individuals (or "agents") didn't mind being surrounded or living by agents of a different race, they would still choose to segregate themselves from other agents over time. Although the model is quite simple, it gives a fascinating look at how individuals might self-segregate, even when they have no explicit desire to do so.

The model was formulated in a social science context, but it has direct analogues to physical systems, such as in the mixing of liquids in an emulsion and the self-assembly of crystalline structures on metal surfaces. Emulsions are immiscible (incapable of mixing) colloidal suspensions of one liquid in another liquid. Emulsions will separate into their individual components if allowed to sit for long enough. Some examples of emulsions include: face creams, soapy water and salad dressings made by shaken oil and vinegar.

Schelling's Model

Here is a simple variation on Schelling's model. Suppose there are two types of agents: X and O. The two types of agents might represent oil and water droplets in a colloidal suspension. Two populations of the two agent types are initially placed into random locations of a neighborhood represented by a grid. After placing all the agents in the grid, each cell is either occupied by an agent or is empty as shown below.

Now we must determine if each agent is satisfied with its current location. A satisfied agent is one that is surrounded by at least $t$ percent of agents that are like itself. This threshold $t$ is one that will apply to all agents in the model, even though in the social science example, the "agents" might each have a different threshold they are satisfied with. Note that the higher the threshold, the higher the likelihood the agents will not be satisfied with their current location.

For example, if $t$ = 30%, agent X is satisfied if at least 30% of its neighbors are also X. If fewer than 30% are X, then the agent is not satisfied, and it will want to change its location in the grid. For the remainder of this explanation, let's assume a threshold $t$ of 30%. This means every agent is fine with being in the minority as long as there are at least 30% of similar agents in adjacent cells.

The picture below (left) shows a satisfied agent because 50% of X's neighbors are also X (50% > $t$). The next X (right) is not satisfied because only 25% of its neighbors are X (25% < $t$). Notice that in this example empty cells are not counted when calculating similarity.

When an agent is not satisfied, it can be moved to any vacant location in the grid. Any algorithm can be used to choose this new location. For example, a randomly selected cell may be chosen, or the agent could move to the nearest available location.

In the image below (left), all dissatisfied agents have an asterisk next to them. The image on the right shows the new configuration after all the dissatisfied agents have been moved to unoccupied cells at random. Note that the new configuration may cause some agents who were previously satisfied to become dissatisfied!

One way of proceeding is to force all dissatisfied agents to move in the same round. After the round is complete, a new round begins, and dissatisfied agents are once again moved to new locations in the grid. These rounds continue until all agents in the neighborhood are satisfied with their location.

Alternatively, the agents can be scheduled to move in a random order, or in an order that sweeps downward along rows of the grid; they can move to the nearest location that will make them satisfied or to a random one.

There also needs to be a way of handling situations in which an agent is scheduled to move, and there is no cell that will make it satisified. In such a case, the agent can be left where it is, or moved to a completely random cell.

Note: a cell’s neighbors are the cells that touch it, including diagonal contact; thus, a cell that is not on the boundary of the grid has eight neighbors.

Program Implementation

For this project, you will implement Schelling's model using iPython widgets to create an interactive tool for exploring the properties of the system under different conditions. Your program should be able to generate time-series data on the evolution of the system (average dissatisfaction, clumping patterns, etc.) that can be analyzed using statistical physics models such as those described in the two papers referenced at the beginning of the project description.

Once you have a working model, you will receive a set of questions to answer using the model that will form the basis of your demo presentation during the final exam.

Here are some example questions you could be asked:

  • Map out how the satisfaction threshold $t$ affects the average satisfaction and amount of "clumping" as a function of time. i.e. How many rounds does it take for a neighborhood to completely sort itself ("equilibrate") for a given value of $t$?
  • What happens when one population is much larger than the other, say 75/25?
  • What happens if the neighborhood is pre-seeded with clumps of a certain size?

Your code should allow for exploring the results when varying the similarity threshold, the population density, the starting relative concentration of agents, the size of the system, etc.

You will also be asked to produce plots similar to those in the papers and try to reproduce some of the phase transition behavior observed. Keep this in mind when designing your code. It should be well-documented and modular enough to allow you to adapt it to different situations.

Note: The exercises we did using ipythonblocks (like Battleship) will come in handy. Review your code to help you get started.


All content is under a modified MIT License, and can be freely used and adapted. See the full license text here.