Puppet Tutorial for Beginners

What is Puppet?

Puppet is a configuration management tool used for deploying, configuring, and managing servers. It enables system administrators to work faster and more smoothly with the help of automation.

Puppet Overview

  • Puppet is open source software and a configuration management tool.
  • Puppet runs on Unix-like systems (including Linux) and on Windows.
  • Puppet is produced by Puppet, Inc., an IT automation software company located in Portland, Oregon.
  • Luke Kanies founded Puppet in 2005.
  • Puppet is written in Ruby, C++, and Clojure.
  • Puppet was released as free software under the GNU General Public License (GPL) until version 2.7.0, and under the Apache License 2.0 after that.
  • Users describe system resources and their desired state, using either Puppet’s declarative language or a Ruby DSL (domain-specific language).

What exactly does Puppet do?

  • Puppet defines a distinct configuration for every host. Once the configuration is set, Puppet checks at regular intervals (every 30 minutes by default) that the required configuration is in place and has not been altered. If the configuration has been modified, Puppet reverts the host to the required configuration.
  • Puppet scales machines up and down dynamically.
  • Puppet provides control over all configured machines, so a centralized change is propagated to all of them automatically.
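
The check-and-revert behaviour described in the first point can be pictured as a small convergence loop. The sketch below is plain Python, not Puppet code. The `converge` function and the sample settings are hypothetical and only illustrate the idea of comparing the actual state with the desired state and correcting any drift.

```python
def converge(desired, actual):
    """Bring `actual` in line with `desired`; return the settings that were corrected."""
    corrected = []
    for key, wanted in desired.items():
        if actual.get(key) != wanted:
            actual[key] = wanted  # revert the drifted (or missing) setting
            corrected.append(key)
    return corrected


# Hypothetical desired state for one host.
desired = {"ntp_server": "pool.ntp.org", "ssh_root_login": "no"}

# Someone changed a setting by hand, so the host has drifted.
actual = {"ntp_server": "pool.ntp.org", "ssh_root_login": "yes"}

first_run = converge(desired, actual)   # corrects the drifted setting
second_run = converge(desired, actual)  # finds nothing left to correct
```

Running the function a second time changes nothing, which is the property that makes repeated checks safe.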

Features of Puppet

Puppet’s main features are as follows:

  1. Large Installed Base

Puppet is used by more than 30,000 companies worldwide, including Google, Siemens, and Red Hat. Universities such as Harvard Law School and Stanford also use Puppet. Almost 22 new organizations per day use Puppet for the first time.

2. Large Developer Base

Puppet has many contributors to its core source code.

3. Long Commercial Track Record

Puppet has been used commercially since 2005 and has been continually improved and refined. It has been deployed in very large infrastructures (5,000+ machines), and the scalability and performance lessons learned from those projects have been applied to Puppet by its developers.

4. Documentation

Puppet has a website with hundreds of pages of documentation maintained by its user community; any member of that community may add or modify content. It also contains comprehensive references for both the language and the resource types. In addition, solutions to Puppet problems are easy to find because several mailing lists are active, and there is a popular IRC (Internet Relay Chat) channel. IRC is an application-layer protocol for text-based communication that works on a client-server network model.

5. Platform Support

Puppet runs on any operating system that supports Ruby, for example Windows and Linux (including CentOS). It runs on both current and older operating systems and Ruby versions.

Before going into the details of Puppet, it helps to understand configuration management and why it is needed.

Configuration Management

System administrators usually perform repetitive tasks such as installing and configuring servers.

They can automate those tasks by writing scripts. That works for a small organization with a small infrastructure, but for organizations with large infrastructures, writing and maintaining such scripts becomes tedious.

Configuration management is the solution to such problems.

Configuration management is the practice of handling changes to a system systematically so that the system maintains its integrity over time.

For project management and audits, configuration management provides access to an accurate historical record of the system state.

Configuration management overcomes the following challenges:

  • Figuring out which components to change when requirements change.
  • Redoing an implementation because the requirements have changed since the last implementation.
  • Reverting to the previous version of a component when it has been replaced with a new but flawed version.
  • Replacing the wrong component because the administrator could not determine which component needed replacing.

Let us understand the importance of configuration management through a use case:

A good example is the New York Stock Exchange (NYSE). A software glitch prevented the NYSE from trading stocks for almost 90 minutes, which led to a loss of millions of dollars. New software had been installed on ten of the 24 trading terminals, and those systems were tested the night before. The next morning, however, the ten terminals failed to operate correctly.

The new software installation caused the problem, so the latest software had to be uninstalled from the terminals and replaced with the old version. You might think this shows that the NYSE’s configuration management process failed. In fact, the configuration management process allowed the NYSE to recover in only 90 minutes, which is very fast.

Essential concepts of ‘Configuration Management’

Configuration management consists of several interdependent activities. Before looking at them, let us define a configuration item (CI). A configuration item is any service component, infrastructure element, or other item that must be managed to ensure the successful delivery of services. Individual documents, models, plans, and software are examples of configuration items.

The configuration management processes (activities) are:

  1. Configuration Identification
  2. Change Management
  3. Configuration Status Accounting
  4. Configuration Audits

Let us look at each process in detail:

  1. Configuration Identification

Configuration identification is the process in which several activities (labeling, identification, grouping, and labeling revision) are carried out for each configuration item.

  • Labeling Activity: This activity labels hardware and software configuration items with unique identifiers.
  • Identification Activity: This activity identifies the documentation that describes the configuration item (CI).
  • Grouping Activity: This activity groups related configuration items into baselines.
  • Labeling Revision Activity: This activity revises the labels of baselines and configuration items.

2. Change Management

Change management is the process of handling changes systematically, whether from an individual or an organizational perspective.

3. Configuration Status Accounting

Configuration status accounting includes the following activities:

  • Recording and reporting the descriptions of configuration items such as hardware, software, and firmware.
  • Recording all departures from the baseline made during design and production.
  • Allowing the baseline configuration and approved modifications to be verified quickly when a problem is suspected.

4. Configuration Audit

A configuration audit determines the degree to which the current state of the system is consistent with the latest documentation and baseline. It is also a formal review confirming that the product was delivered on time and will work as promised to the customer. The findings of the audit are cross-checked against the output of configuration status accounting to ensure that the required product was built.
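
The comparison at the heart of a configuration audit can be sketched in a few lines of Python. This is only an illustration: the `audit` function, the item names, and the version numbers are all made up, and a real audit covers far more than a version comparison.

```python
def audit(baseline, current):
    """Return {item: (expected, found)} for every departure from the baseline."""
    departures = {}
    for item in sorted(set(baseline) | set(current)):
        expected = baseline.get(item)  # None means "not part of the baseline"
        found = current.get(item)      # None means "not present on the system"
        if expected != found:
            departures[item] = (expected, found)
    return departures


# Hypothetical baseline and observed state of one system.
baseline = {"openssl": "3.0.2", "nginx": "1.18.0"}
current = {"openssl": "3.0.2", "nginx": "1.24.0", "telnetd": "0.17"}

report = audit(baseline, current)
```

Here the report lists one item whose version differs from the baseline and one item that the baseline does not mention at all.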

Puppet Architecture

Puppet follows a client-server (master-slave) architecture. The Puppet client and server communicate with each other over SSL (Secure Sockets Layer), a secure encrypted channel.

[Figure: Puppet architecture]

As the diagram above shows, the Puppet agent (also called the Puppet node, client, or slave) sends facts to the Puppet server (master). In response, the Puppet server sends a catalog to the agent. Finally, the agent reports back to the Puppet server.

  1. The Puppet agent sends facts to the Puppet master. Facts are key-value pairs that represent some aspect of the client’s state, such as its IP address, operating system, uptime, and whether the machine is virtual.
  2. The Puppet master uses the facts to compile the catalog. The catalog is a document that defines the configuration for the Puppet agent by describing the desired state of each resource on a slave managed by the master.
  3. The Puppet slave indicates that the configuration is complete by sending a report to the Puppet master, which is visible on the dashboard.
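
The three steps above can be sketched as a toy round trip in Python. Nothing here is Puppet's real API: `compile_catalog`, `apply_catalog`, the fact names, and the two resources are hypothetical and exist only to show how facts feed the catalog and how applying the catalog yields a report.

```python
def compile_catalog(facts):
    """Master side: turn an agent's facts into a catalog of desired resource states."""
    return {
        "Package[ntp]": {"ensure": "installed"},
        # The hostname fact ends up inside the desired file content.
        "File[/etc/motd]": {"content": f"Managed host {facts['hostname']}\n"},
    }


def apply_catalog(catalog, state):
    """Agent side: enforce the catalog on the local state and build a report."""
    changed = [name for name, wanted in catalog.items() if state.get(name) != wanted]
    for name in changed:
        state[name] = catalog[name]
    return {"status": "changed" if changed else "unchanged", "changed": changed}


# Step 1: the agent's facts (hypothetical values).
facts = {"hostname": "web01", "ipaddress": "192.0.2.10", "is_virtual": True}

# Step 2: the master compiles a catalog from those facts.
catalog = compile_catalog(facts)

# Step 3: the agent applies the catalog and reports back.
state = {}
report = apply_catalog(catalog, state)
```

A second run against the same state reports "unchanged", because the host already matches the catalog.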

Puppet Client-Server communication

The Puppet client and the Puppet server communicate over SSL (Secure Sockets Layer).

[Figure: Puppet client-server communication]

The diagram depicts the communication between client and server:

  • The Puppet slave (client) requests the master certificate from the Puppet master.
  • The Puppet master sends the master certificate to the client in response.
  • The Puppet master requests the slave (client) certificate from the Puppet slave.
  • The Puppet slave sends its certificate to the Puppet master.
  • The Puppet slave requests its data/configuration from the Puppet master.
  • Finally, the master sends the requested configuration/data to the Puppet slave.
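
Puppet carries out this exchange itself, so the following is not how Puppet is configured. It is only a rough illustration, using Python's standard `ssl` module, of what it means for both sides to present and check a certificate. The function names are hypothetical, and the certificate paths are left as optional parameters because no real files are assumed.

```python
import ssl


def make_server_context(cert=None, key=None, ca=None):
    """Master side: present a certificate and require one from the client."""
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    ctx.verify_mode = ssl.CERT_REQUIRED  # the client must send its certificate
    if cert and key:
        ctx.load_cert_chain(certfile=cert, keyfile=key)
    if ca:
        ctx.load_verify_locations(cafile=ca)
    return ctx


def make_client_context(cert=None, key=None, ca=None):
    """Agent side: verify the server's certificate and present its own."""
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    if cert and key:
        ctx.load_cert_chain(certfile=cert, keyfile=key)
    if ca:
        ctx.load_verify_locations(cafile=ca)
    return ctx


# No certificate files are loaded here; the contexts only show the settings.
server_ctx = make_server_context()
client_ctx = make_client_context()
```

With both contexts requiring a certificate from the peer, data is exchanged only after each side has verified the other, which matches the order of the steps above.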

Components of Puppet

Following is the list of puppet components:

  1. Manifests
  2. Module
  3. Resource
  4. Facter
  5. MCollective
  6. Catalogs

Let us understand each component in detail:

  1. Manifests

The Puppet master distributes configuration details to every Puppet slave, and those details are written in Puppet’s native language. A manifest is a file that describes the configuration for a Puppet slave. Manifest files have the extension .pp (Puppet policy) and contain Puppet programs that describe the slave’s configuration.

Example: A user writes a manifest on the Puppet master that creates a file and installs the Apache server. That configuration is then applied on every slave connected to the Puppet master.
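
To give a feel for what such a manifest looks like, the Python sketch below prints two resource declarations in Puppet's manifest syntax. The helper `render_resource` is hypothetical, the package name `apache2` assumes a Debian-style system, and the file path and content are invented for the example.

```python
def render_resource(rtype, title, **attributes):
    """Render one resource declaration in Puppet's manifest syntax."""
    lines = [f"{rtype} {{ '{title}':"]
    # Pad attribute names so the arrows line up, as Puppet style suggests.
    width = max((len(name) for name in attributes), default=0)
    for name, value in attributes.items():
        lines.append(f"  {name.ljust(width)} => '{value}',")
    lines.append("}")
    return "\n".join(lines)


manifest = "\n\n".join([
    render_resource("package", "apache2", ensure="installed"),
    render_resource("file", "/tmp/hello.txt", ensure="file", content="Managed by Puppet"),
])
print(manifest)
```

The printed text is what would go into a .pp file: one `package` resource that installs the web server and one `file` resource that creates a file with fixed content.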

2. Module

A Puppet module is a collection of manifests and data, where the data can be facts, files, or templates. Each module has a specific directory structure. Modules are self-contained bundles of code and data that help organize Puppet code by splitting it into multiple manifests.
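
As a rough picture of that directory structure, the Python sketch below creates a minimal module skeleton in a temporary directory. The `scaffold_module` helper and the module name `apache` are hypothetical, and only the most common directories (`manifests`, `files`, `templates`) are shown; real modules often contain more.

```python
import tempfile
from pathlib import Path


def scaffold_module(root, name):
    """Create a minimal module skeleton under `root` and return its path."""
    module = Path(root) / name
    for sub in ("manifests", "files", "templates"):
        (module / sub).mkdir(parents=True, exist_ok=True)
    # manifests/init.pp conventionally holds the class named after the module.
    (module / "manifests" / "init.pp").write_text(f"class {name} {{\n}}\n")
    return module


with tempfile.TemporaryDirectory() as tmp:
    module = scaffold_module(tmp, "apache")
    # Collect the created paths relative to the module root.
    layout = sorted(p.relative_to(module).as_posix() for p in module.rglob("*"))
```

After the run, `layout` holds the relative paths of everything that was created, which makes the expected structure easy to inspect.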

3. Resource

Resources are the basic unit of system configuration modeling. Each resource defines some aspect of the system such as a specific service or package.

4. Facter

Facter gathers facts, that is, basic information about Puppet agents such as operating system, IP address, MAC address, hardware details, network settings, SSH keys, and more. These facts are available as variables in the manifests on the Puppet master.
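
The idea of facts as variables can be sketched in Python. The code below does not use Puppet's own fact gatherer: `gather_facts` collects a few values with the standard library, and `web_server_package` is a hypothetical helper showing how a configuration decision might depend on a fact. The fact name `os_family` is an assumption made for this example.

```python
import platform
import socket


def gather_facts():
    """Collect a few host facts with the standard library."""
    return {
        "hostname": socket.gethostname(),
        "kernel": platform.system(),
        "architecture": platform.machine(),
    }


def web_server_package(facts):
    """Pick the web server package name from the OS family fact (None if unknown)."""
    return {"Debian": "apache2", "RedHat": "httpd"}.get(facts.get("os_family"))


facts = gather_facts()
```

Because the decision reads the fact instead of hard-coding a value, the same description can serve hosts that run different operating systems.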

5. MCollective

MCollective is a framework that runs jobs on many Puppet agents in parallel. It performs the following functions:

  • It interacts with clusters of Puppet agents, whether in large or small groups.
  • It attaches filters to each request and uses the broadcast paradigm to send requests to Puppet agents. All agents receive the request at the same time, but only those agents that match the filters attached to the request respond.
  • It uses simple command-line tools to call remote Puppet agents.
  • It writes custom reports about the infrastructure.
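
The broadcast-and-filter idea can be illustrated with a short Python sketch. It is not the framework's API: `matches`, `broadcast`, the agent names, and their facts are hypothetical, and the loop visits agents one after another, whereas the real framework reaches them in parallel.

```python
def matches(agent_facts, request_filter):
    """An agent answers only if every filter key matches its own facts."""
    return all(agent_facts.get(key) == value for key, value in request_filter.items())


def broadcast(agents, request_filter, action):
    """Send the request to every agent; collect replies from those that match."""
    replies = {}
    for name, agent_facts in agents.items():
        if matches(agent_facts, request_filter):
            replies[name] = action(agent_facts)
    return replies


# Hypothetical fleet of agents with a couple of facts each.
agents = {
    "web01": {"role": "web", "os_family": "Debian"},
    "web02": {"role": "web", "os_family": "RedHat"},
    "db01": {"role": "db", "os_family": "Debian"},
}

replies = broadcast(agents, {"role": "web"}, lambda facts: f"restarting on {facts['os_family']}")
```

Every agent sees the request, but only the two web servers answer. Adding a second key to the filter narrows the set of responders further.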

6. Catalogs

A catalog defines the desired state of each resource on a slave managed by the Puppet master, and it describes the relationships between resources. The Puppet master compiles a catalog for each slave it controls, using modules (manifests plus data), and serves the compiled catalog to the Puppet agent on request.
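
One consequence of recording relationships between resources is that they can be applied in a safe order. The Python sketch below shows this with the standard library's `graphlib` module (Python 3.9 or later). The three resources and their dependencies are a hypothetical example, and this is not a description of how Puppet orders resources internally.

```python
from graphlib import TopologicalSorter  # available in Python 3.9+

# Each resource maps to the set of resources it depends on.
dependencies = {
    "Package[apache2]": set(),
    "File[/etc/apache2/apache2.conf]": {"Package[apache2]"},
    "Service[apache2]": {"Package[apache2]", "File[/etc/apache2/apache2.conf]"},
}

# A topological sort lists every resource after the ones it depends on.
order = list(TopologicalSorter(dependencies).static_order())
```

The package comes first, then its configuration file, then the service that needs both.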

Application of Puppet

Let us see an application of Puppet through a case study. Have you ever played online games? Then you have probably heard of Zynga.

Zynga is one of the world’s best-known and largest social game developers. It has an extensive infrastructure, with tens of thousands of servers across both public cloud and private data centers. In its early days, Zynga handled configuration management manually, using kickstart and post-install scripts to bring hundreds of servers online.

Let us look at the problems Zynga faced while it relied on the manual process and kickstart: scalability and consistency, flexibility, infrastructure portability, and infrastructure insight.

  1. Scalability and Consistency:

Zynga was experiencing phenomenal growth and needed to scale up its infrastructure. Manual approaches and script-based solutions could not meet its requirements.

2. Flexibility:

Because each game has different properties, it was critical for the team to apply the correct configuration to each machine quickly.

3. Portable infrastructure:

Zynga needed a way to achieve a consistent configuration management approach across both public cloud and private data centers.

4. Infrastructure insights:

To run the infrastructure well, Zynga needed to see the properties of each machine automatically.

Zynga realized that it needed an automated process before it could scale further. As a result, Puppet came into the picture. Let us now see the role Puppet plays at Zynga.

Speed of Recovery

The production team can now quickly deploy the right configuration to the right machine. If a system is configured incorrectly, Puppet reverts it to the last desired state.

Speed of Deployment

Puppet gives the operations team significant time savings, so it can deliver services to the game studios faster.

Consistency of Servers

Puppet’s model-driven framework ensures the consistency of servers. Puppet delivers a consistent configuration across Zynga’s servers in a short time.

Collaboration

Because Puppet has a model-driven framework, configurations are easy to share across the organization. Puppet helps ensure that new services are delivered with high quality by allowing the operations and development teams to work together.
