GAFAM Empire

Methodology

We describe the complete methodology of the project: how data was collected, manipulated, interpreted, and finally visualized. The project was developed as an experimental exploration of how to deal with data describing company acquisitions, and of how much tacit information we can derive from analyzing and visualizing it.

What are we looking for?: Diving into a gray area

GAFAM stands for the five largest, most dominant, and most prestigious companies in the information technology industry of the United States as of 2022: Alphabet (Google), Amazon, Apple, Meta (Facebook) and Microsoft.
GAFAM operates in a gray area in which it is difficult to say whether the five companies are competing with one another, or whether monopoly issues exist without our noticing them. Since 1975, Microsoft has acquired more than 357 companies. This project looks at the relationships between the companies acquired by GAFAM, shifting from the familiar money-based approach toward a close reading of how each company defines itself. There is no complete and transparent information available on how much money the GAFAM companies invest and spend. We cannot know with certainty why they buy other companies and start-ups. But we can observe what they buy and how each acquired company describes its business and area of interest. In attempting to capture a picture of their interests, hopes, and beliefs, our approach recognizes and accepts that the information we have is incomplete: we are diving into a gray area.

Data collection

The project uses Crunchbase as a source: a platform for finding business information on private and public companies. The information in the profiles of individuals, organizations, and institutions is sourced in four ways: the venture program, machine learning, an in-house data team, and the Crunchbase community. Members of the public can submit information to the Crunchbase database. These submissions are subject to registration and social validation, and are often reviewed by a moderator before being accepted for publication.
The following image shows some of the information describing each company that we use in the project.

Image showing how each field of the Crunchbase web page is used in building the dataset.

The information was collected for the 1,210 companies that have been acquired by GAFAM over the years, ranging from Microsoft's acquisition of Forethought (the developer of PowerPoint) in 1987 to Apple's purchase of Joby Aviation, an aerospace transportation company that develops electric aircraft, in the first days of April 2022. Companies listed in the dataset were assigned tags describing their business and work. For example, Joby Aviation is tagged "Aerospace", "Air Transportation", "Electric Vehicle", and "Transportation". Relevant data was also collected to trace the chains of acquisitions that lead to the GAFAM companies. For example, Tavolo was acquired in 2000 by OurHouse, which was in turn acquired by Amazon the following year.

| Company | Parent company | Type | Date of acquisition (DD/MM/YYYY) | Operating status | Location | Description |
| --- | --- | --- | --- | --- | --- | --- |
| Forethought | Microsoft | acquisition | 30/07/1987 | Active | Sunnyvale, California, United States | Forethought, Inc. is a computer software company that developed PowerPoint, a slide-based presentation program. |
| OurHouse | Amazon | acquisition | 01/12/2001 | Active | Decatur, Georgia, United States | OurHouse is an online retailer that provides its clients with hardware products and advice from its official online platform. |
| Tavolo | OurHouse | acquisition | 12/12/2000 | Active | Minneapolis, Minnesota, United States | Tavolo is an online retailer and destination for consumers who cook, dine and entertain. |
| Joby Aviation | Apple | acquisition | 01/04/2022 | Active | Santa Cruz, California, United States | Joby Aviation is an aerospace transportation company developing electric aircrafts. |
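
The acquisition pipeline described above can be traced by following the "Parent company" column upward until no further parent is recorded. The following is a minimal sketch of that idea, not the project's actual code: the rows are copied from the sample table, while the names `ROWS`, `GAFAM`, `acquisition_chain`, and `ultimate_gafam_owner` are hypothetical and introduced only for illustration.

```python
from datetime import date

# Names under which a GAFAM company may appear as a parent (assumed list).
GAFAM = {"Alphabet", "Google", "Amazon", "Apple", "Meta", "Facebook", "Microsoft"}

# (company, parent company, date of acquisition) rows from the sample table.
ROWS = [
    ("Forethought", "Microsoft", date(1987, 7, 30)),
    ("OurHouse", "Amazon", date(2001, 12, 1)),
    ("Tavolo", "OurHouse", date(2000, 12, 12)),
    ("Joby Aviation", "Apple", date(2022, 4, 1)),
]


def acquisition_chain(company, rows):
    """Return [company, its parent, the parent's parent, ...]."""
    parent_of = {child: parent for child, parent, _ in rows}
    chain = [company]
    seen = {company}
    while chain[-1] in parent_of:
        parent = parent_of[chain[-1]]
        if parent in seen:  # guard against cyclic records
            break
        chain.append(parent)
        seen.add(parent)
    return chain


def ultimate_gafam_owner(company, rows):
    """Return the GAFAM company at the top of the chain, or None."""
    top = acquisition_chain(company, rows)[-1]
    return top if top in GAFAM else None


print(acquisition_chain("Tavolo", ROWS))  # ['Tavolo', 'OurHouse', 'Amazon']
```

The sketch assumes each company has at most one recorded parent; a company that was acquired, sold, and acquired again would need one row per transaction and a rule for choosing among them.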

In this project we use neither the geographic locations of the companies nor monetary values related to their purchase, sale, profit, or investment, since our focus is on the description of what is purchased.

Data manipulation and visualization

To see the relationships among the companies acquired by GAFAM, we first looked at the general landscape of where these companies are located according to their businesses. We used the tags that define them, attempting to group the companies according to similarities in how they are defined. However, given the lack of uniformity among the 390 different tags that define the companies, it was necessary to organize the tags into categories that would allow us to group the companies together.
We therefore built a hierarchical schema grouping tags into categories, from the finest to the broadest.

For example, Handipoints is "a website with a chore chart component and a virtual world containing a range of online games and activities for children" acquired by Slide in 2009 and then by Google in 2010. Its tags are Casual games, Gaming, and Virtual games, resulting in Gaming as a broad tag from our hierarchical schema.

Schema explaining how the tags were aggregated into bigger categories.
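
The aggregation of fine tags into a broad tag can be pictured as a lookup from each tag to its category. The sketch below is illustrative only: it collapses the multi-level schema into a single level, and the mapping `BROAD_TAG` and the function `broad_tags` are hypothetical. Only the three Handipoints tags and their broad tag "Gaming" are taken from the text.

```python
from collections import Counter

# Hypothetical fragment of the fine-to-broad mapping. Only these three
# tags and the broad tag "Gaming" are stated in the text.
BROAD_TAG = {
    "Casual games": "Gaming",
    "Virtual games": "Gaming",
    "Gaming": "Gaming",
}


def broad_tags(tags, mapping=BROAD_TAG):
    """Count how many of a company's tags fall under each broad tag.

    Tags missing from the mapping are kept under their own name, so
    nothing is silently dropped.
    """
    return Counter(mapping.get(tag, tag) for tag in tags)


handipoints = ["Casual games", "Gaming", "Virtual games"]
print(broad_tags(handipoints))  # Counter({'Gaming': 3})
```

Keeping unmapped tags visible is a deliberate choice in this sketch: with 390 heterogeneous tags, gaps in the schema are easier to spot when they surface in the output.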

Landscape of acquisitions

Once the tags were organized, we built a map that positions each company according to the similarity of its combination of tags. We used UMAP, a dimensionality reduction technique, to compute map coordinates in which companies with similar tag combinations sit close together, which allows us to see the landscape of these companies.
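
The text does not say how the tag combinations were turned into numeric input for UMAP. One plausible preparation, shown here only as a sketch under that assumption, is to encode each company as a 0/1 vector over the tag vocabulary and to compare companies with a Jaccard distance. The function names and the second company are hypothetical; Joby Aviation's tags come from the data collection section.

```python
def multi_hot(tags, vocabulary):
    """Encode a set of tags as a 0/1 vector over a fixed tag vocabulary."""
    return [1 if tag in tags else 0 for tag in vocabulary]


def jaccard_distance(a, b):
    """One minus (shared tags / all tags); 0.0 when both sets are empty."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Joby Aviation's tags are from the text; the second company is made up.
companies = {
    "Joby Aviation": {
        "Aerospace", "Air Transportation", "Electric Vehicle", "Transportation",
    },
    "Hypothetical Co": {"Electric Vehicle", "Transportation"},
}

vocabulary = sorted(set().union(*companies.values()))
vectors = {name: multi_hot(tags, vocabulary) for name, tags in companies.items()}

distance = jaccard_distance(
    companies["Joby Aviation"], companies["Hypothetical Co"]
)
print(distance)  # 0.5
```

A UMAP implementation would then take the matrix of such vectors and return one pair of coordinates per company. Which implementation, distance metric, and parameters the project actually used is not stated in the text.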
The next step was to read the clusters of companies and understand their business and interests. We observed that in many cases the tags describing a company were not sufficient to understand its business. For example, Neven Vision, acquired by Google in 2006, carries the tag Software and is described as a "developer of mobile recognition engines for mobile phone and consumer electronics industries, and more". After further research, we found that Neven Vision was a biometric and photo recognition company. It held patents on technology ranging from photo analysis to face recognition in video files, as well as several patents on facial capture for avatar animation. The company was heavily focused on mobile phones and also offered a product to deliver coupons to mobile devices. We therefore changed its tag from "Software" to "Biometrics".
After researching each company, we noticed that it is difficult to classify companies by the technologies they develop alone: in many cases no information about those technologies is available, and many technologies are used across different sectors. For example, a biometric technology such as speech recognition can be used in sectors as diverse as health (Dictaphone) or security. For this reason, the project classifies each company by both a technology and a sector in which it can be recognized. It proposes an overview of 15 sectors and 10 types of technologies. Color is used to encode the sector and technology throughout the project.

GAFAM Petri dishes

After viewing all the companies in the same landscape, the "petri dishes" model aggregates them according to each GAFAM company that purchased them. We represent the interests of each GAFAM company by grouping the companies by sector and using color to describe the technology. The petri dishes show the accumulation of all companies that were purchased by a GAFAM company over time. Companies that were acquired and then sold are not included in the petri dish, and companies acquired through the acquisition of another company are included only when the latter was acquired.

Drag the bubbles. Companies belonging to the same cluster are encircled.
The visualization uses a force-based layout that clusters individual nodes according to a categorical scale (in this case, the sector each company belongs to); the nodes are then colored by their technological affinity. To make groups easier to identify, all companies belonging to the same sector are encircled by a convex hull.
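
To illustrate the encircling step, the sketch below computes one convex hull per sector from node positions. It uses Andrew's monotone chain algorithm, chosen here for brevity; the text does not say which hull algorithm or library the visualization relies on. The function names, sector labels, and coordinates are hypothetical.

```python
from collections import defaultdict


def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each half is the first point of the other half.
    return lower[:-1] + upper[:-1]


def hulls_by_sector(nodes):
    """Map each sector to the hull of its nodes; nodes are (sector, x, y)."""
    grouped = defaultdict(list)
    for sector, x, y in nodes:
        grouped[sector].append((x, y))
    return {sector: convex_hull(pts) for sector, pts in grouped.items()}


# Made-up node positions, as they might look once the layout has settled.
nodes = [
    ("Health", 0, 0), ("Health", 2, 0), ("Health", 2, 2),
    ("Health", 0, 2), ("Health", 1, 1),
    ("Security", 5, 5), ("Security", 6, 5),
]
hulls = hulls_by_sector(nodes)
print(hulls["Health"])  # [(0, 0), (2, 0), (2, 2), (0, 2)]
```

In an interactive layout the hulls would need to be recomputed whenever node positions change, for example while a bubble is being dragged.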

Timelines of expansion

The timeline visualizes the chain of company purchases made by each GAFAM company in order to examine how its interests relate over time. Acquisitions and sales can offer clues to the motivations and interests of GAFAM companies. For example, Powerschool is a company founded in 1997 that started buying other start-ups and companies in line with its interest in the education sector. In 2001 Apple bought Powerschool and continued to buy other companies in the sector. Finally, Apple sold Powerschool to Onex in 2022. Such chains of acquisitions and sales are not linear, and they deserve a closer look at when each step happened and which sectors and technologies were involved.

The visual model combines a hierarchical structure, usually visualized using dendrograms, or tree-like structures, and a time-based visualization, by placing nodes in the year of their acquisition by a GAFAM company. The result is a time-based hierarchical network structure that showcases the intricacies of the relationships between GAFAM and acquired companies. Companies that were sold by one GAFAM company to another, however, are visualized but only as sold companies.

A company's complete dendrogram shows all acquisitions from its initial founding. Red connections show companies that have been acquired and closed after some time, while green ones show companies that were later sold. Dashed connections show how an acquisition included previous acquisitions by that company. To build this hierarchical structure, we filter the complete dataset by the ultimate acquisition, and stratify it using d3.js to obtain a data structure that can be used to build a dendrogram.
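
The filter-then-stratify step can be sketched as building a nested tree from flat (company, parent) rows, starting at the ultimate acquirer. The project does this with d3.js; the Python function below only mirrors the idea and is not the project's code. The name `stratify` is borrowed for readability, and the rows reuse examples given earlier in the text.

```python
def stratify(rows, root):
    """Nest (company, parent) rows under `root` as {"name", "children"}.

    Only companies reachable from `root` appear in the result, so the
    call also filters the dataset by ultimate acquirer. Assumes the
    rows contain no cycles.
    """
    children = {}
    for company, parent in rows:
        children.setdefault(parent, []).append(company)

    def build(name):
        return {
            "name": name,
            "children": [build(child) for child in sorted(children.get(name, []))],
        }

    return build(root)


rows = [
    ("Tavolo", "OurHouse"),
    ("OurHouse", "Amazon"),
    ("Forethought", "Microsoft"),
]
tree = stratify(rows, "Amazon")
print(tree)
```

A dendrogram layout can then walk this structure. Attributes such as the year of acquisition or whether a company was later closed or sold would be carried on each node to position it on the timeline and color its connection.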

To conclude, the methodology used in this project acknowledges the partiality of the data and of the analysis performed. This partiality is due to the overall amount of data collected, the source of the data, and the timeframe in which it was first compiled. Many companies acquired long ago lack information from the companies themselves, while many small, recently acquired companies have little information available because of their size. With this in mind, we provide the entire dataset used to produce the visualizations and writings on this website.
