Monday, November 18, 2013

Ruby bundle and Puppet, both of them behind a proxy

Bundler (the bundle command) uses the http_proxy environment variable when it is defined.


But if you call it from a Puppet exec resource, do not forget to pass the variable explicitly using the environment parameter ...



Example:

exec { "/usr/bin/bundle install --deployment --without development test postgres aws":
        environment => 'http_proxy=http://192.168.1.23:3128',
        cwd => "${homedir}/gitlab",
        timeout => 3600,
        creates => "${homedir}/gitlab/vendor/bundle"
}



Friday, August 16, 2013

How to disable the blinking green led of a Cubieboard

A Cubieboard is a Raspberry Pi-like device, with more memory (1GB instead of 512MB) and a more powerful processor. I use these beasts at home for some devops experiments.
A Cubieboard

The problem with the Cubieboard is its flashing green LED, which can be very annoying, especially at night ...


The following command (requires root privileges) shuts down this flashy green LED (and makes your husband/wife/companion much more "always-on devices" friendly):



# echo none >  /sys/class/leds/ph20\:green\:led1/trigger

One can check the status of this trigger using the command:

# cat /sys/class/leds/ph20\:green\:led1/trigger
[none] battery-charging-or-full battery-charging battery-full battery-charging-blink-full-solid ac-online usb-online mmc0 mmc1 timer heartbeat rfkill0

More info about these LEDs here.


Have a good night !

PS: To make this persistent across reboots, the command can be added to the /etc/rc.local file (this works at least on Raspbian; it may differ on other distributions).
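
For instance, a minimal /etc/rc.local could look like this (just a sketch; keep whatever your distribution already puts in that file, and adjust the LED path if your board exposes a different name):

#!/bin/sh -e
# Disable the blinking green LED of the Cubieboard at boot time
echo none > /sys/class/leds/ph20\:green\:led1/trigger
exit 0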

Monday, May 6, 2013

Apache Archiva on a Raspberry PI

Apache Archiva
The purpose of this post is to provide some hints on how to set up an Apache Archiva artifact repository server on a Raspberry PI.




A Raspberry PI in a plastic case
In a few words, a Raspberry PI is an affordable credit-card-sized device with a very low power consumption, which runs Raspbian Wheezy, a Debian-derived Linux distribution (among other operating systems). Other characteristics of interest are:

  • ARM V6 Processor
  • 512 MB of RAM
  • 100 Mb/s Ethernet port
  • 2 USB ports
  • HDMI

My initial intent when I bought a Raspberry was to use it as an artifact repository to deploy and share my Maven build artifacts (jars, wars, ...) between the different computers of my home network.

Artifact repositories have a lot of other functionalities, among them:
  • acting as a specialized Maven proxy, which saves network bandwidth when two or more Maven instances need the same dependencies (artifacts),
  • keeping the Maven service "on" when the Internet access is "off".
As I am using Apache Archiva at work, it was a natural choice to use it at home too (and a compatibility choice, because I use some Archiva-dependent URLs in the SCM history of my pom.xml files - see below when it comes to the Archiva context path).

So my first attempt was to install Archiva 1.3.6 (the latest stable version) on Raspbian Wheezy and execute it using OpenJDK 7.

#fail #1 : Missing wrapper !

Archiva failed to start because the Tanuki Java Service Wrapper binary embedded in its latest stable version has no ARM support. So I got the missing part from the Tanuki site ... and it worked (with a scary warning, more on this below)!

#fail #2 : But too slow !

I found Archiva very slow in this context (and unusable) ...

The explanation is the following: OpenJDK 7 itself is very slow under Raspbian.

Raspbian is a good choice for using the PI at its full performance level under Linux, but it is a bad choice for Java (Soft Float Debian is known to be better for that).

#fail #3 : Oracle JVM not running under Raspbian !

The Oracle JVM being faster than OpenJDK on ARM platforms, I downloaded the latest JDK 7 from Oracle ... and it failed to work (see here for another testimony).

#fail #4 : Spring killed me !

I was about to switch to Soft Float Debian Wheezy when I remembered that Oracle had produced, in late 2012, a preview of JDK 8 for ARM processors ... and this one does work under Raspbian Wheezy.

Full of hope, I used it to launch Archiva 1.3.6 ... but it failed :

2013-05-06 16:02:33,079 [WrapperSimpleAppMain] ERROR org.springframework.web.context.ContextLoader  - Context initialization failed
org.springframework.beans.factory.BeanDefinitionStoreException: Unexpected exception parsing XML document from URL [jar:file:/home/archiva13/apache-archiva-1.3.6/apps/archiva/WEB-INF/lib/redback-configuration-1.2.9.jar!/META-INF/spring-context.xml]; nested exception is java.lang.IllegalStateException: Context namespace element 'annotation-config' and its parser class [org.springframework.context.annotation.AnnotationConfigBeanDefinitionParser] are only available on JDK 1.5 and higher
        at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.doLoadBeanDefinitions(XmlBeanDefinitionReader.java:420)
        at org.springframework.beans.factory.xml.XmlBeanDefinitionReader.loadBeanDefinitions(XmlBeanDefinitionReader.java:342)
...
Caused by: java.lang.IllegalStateException: Context namespace element 'annotation-config' and its parser class [org.springframework.context.annotation.AnnotationConfigBeanDefinitionParser] are only available on JDK 1.5 and higher
        at org.springframework.context.config.ContextNamespaceHandler$1.parse(ContextNamespaceHandler.java:65)
        at org.springframework.beans.factory.xml.NamespaceHandlerSupport.parse(NamespaceHandlerSupport.java:69)

JDK 1.8 is higher than JDK 1.5, isn't it? It should be!

#success #1 : Use the next version of Archiva (1.4-M3)

I had no choice but to download the latest standalone development version of Archiva (the 1.4 series), and ... hurrah!

It worked with Oracle JDK 8 ... and, most importantly, at an acceptable speed (don't be put off by the start-up time, and more particularly by the initial start-up time).

NB: In conf/wrapper.conf, I just changed the -Xms and -Xmx Java heap settings to 64MB and 256MB respectively, to better fit the memory available on the PI (512MB).
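
For reference, this is the kind of change involved in conf/wrapper.conf (a sketch; depending on how the wrapper.conf shipped with your Archiva version is organized, the heap may be set through these dedicated Tanuki memory properties or through explicit -Xms/-Xmx entries in wrapper.java.additional.N lines):

# Initial Java heap size (in MB)
wrapper.java.initmemory=64
# Maximum Java heap size (in MB)
wrapper.java.maxmemory=256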

There was still a warning in the logs: the Tanuki wrapper binary for the 32-bit ARM architecture was not in sync with the Tanuki jar (lib/wrapper.jar) embedded in the Archiva distribution (an unsupported combination).

This one was easy to solve (download the sources, build, and put the binaries in place):

root@raspberry:/tmp# wget -O wrapper.zip http://sourceforge.net/projects/wrapper/files/wrapper_src/Wrapper_3.5.19_20130419/wrapper_3.5.19_src.zip/download
root@raspberry:/tmp# unzip wrapper.zip
Archive:  wrapper.zip
   creating: wrapper_3.5.19_src/
   creating: wrapper_3.5.19_src/build/
   creating: wrapper_3.5.19_src/doc/
   creating: wrapper_3.5.19_src/src/
   ...
root@raspberry:/tmp# cd wrapper_3.5.19_src/
root@raspberry:/tmp/wrapper_3.5.19_src# export ANT_HOME=/usr/share/ant
root@raspberry:/tmp/wrapper_3.5.19_src# export JAVA_HOME=/opt/jdk1.8/
root@raspberry:/tmp/wrapper_3.5.19_src# sh build32.sh
--------------------
Wrapper Build System
using /tmp/wrapper_3.5.19_src/./build.xml
--------------------
Buildfile: /tmp/wrapper_3.5.19_src/build.xml
...
main:

BUILD SUCCESSFUL
Total time: 4 minutes 8 seconds
root@raspberry:/tmp/wrapper_3.5.19_src# ls -l bin/wrapper lib/wrapper.jar lib/libwrapper.so 
-rwxr-xr-x 1 root root 287313 mai    6 16:33 bin/wrapper
-rwxr-xr-x 1 root root  40225 mai    6 16:34 lib/libwrapper.so
-rw-r--r-- 1 root root 119494 mai    6 16:34 lib/wrapper.jar

Instructions to install the new wrapper:

cp bin/wrapper $ARCHIVA_HOME/bin/wrapper-linux-armv6l-32
cp lib/libwrapper.so $ARCHIVA_HOME/lib/libwrapper-linux-arm-32.so
cp lib/wrapper.jar  $ARCHIVA_HOME/lib


#fail #5 : Archiva 1.4-M3 killed me !

The new Archiva UI was OK and responsive; I was a happy man. I created some user accounts and started to run some tests with Maven (after all, that was the whole point) ... and they failed.

Nothing was downloadable from my brand new Raspberrized Archiva instance!

#success #2 : Always, always read and read again the release notes !

Starting from Archiva 1.4-M3, the context path for Archiva is "/" and no longer "/archiva".

Fortunately, there is a parameter for that (thank you Olivier Lamy!).

I was then able to test my installation using my Maven settings.xml file and all my projects' pom.xml files (after emptying my ~/.m2/repository directory to validate the whole thing, of course ...).

Update for the 1.4-M4 release: as Olivier Lamy points out in his first comment below, since the 1.4-M4 release the property -DAsyncLogger.WaitStrategy=Block must be used when launching Archiva on the Raspberry in order to keep an acceptable level of performance. More details here.
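
With the standalone (Tanuki-wrapped) distribution, such a system property is typically passed through an additional JVM option in conf/wrapper.conf; the index used below is only an example, pick the next free one in your file:

wrapper.java.additional.10=-DAsyncLogger.WaitStrategy=Block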

Conclusion

The good news is that it is feasible to run Archiva on a home network using a PI.

After Gitolite and Archiva, the next logical target for me would be to try the Jenkins adventure, to take a step toward the forge@home (on another PI; one service = one PI, these machines are too small!) ...

I will give it a try, unless I wait for a slightly more powerful PI-like device for Jenkins (something like that, or what we can expect to be the next generation of PIs).

In a future post, I will describe a Puppet module for installing or updating an Archiva 1.4.x instance on the PI (or elsewhere).

I know that the puppetization of Archiva, like that of many other pieces of software, has been done many times already.

But since I entered the Puppet world, I do think that there is no such thing as a "Puppet Universal and Reusable Module for middleware X or tool Y".

Every company has its own way of doing and organizing things, its own understanding of each piece of software and its own set of configuration parameters, and that is why I think it is very difficult for a Puppet module to reach a universal level.

I prefer to consider Puppet modules as snippets in which I always find new ways of doing things (either better or just different), and this level of sharing is sufficient for me.

Sunday, March 24, 2013

Installing Git and Gitolite on a Raspberry PI

If you want to give Git a try at home (and are not interested in GitHub), installing Gitolite on a Raspberry PI is one of the best options you have, even better than installing Git on a Synology NAS (in case you do not want to tweak your Synology device).
A Raspberry PI in a plastic case.

All you have to do is to:
  • order a Raspberry PI model B (less than € 35) and some accessories. Here you have a list of compatible items you can use with a PI. In addition to my PI, I bought:
    • a USB power supply (~ € 7-12; beware, the output must be at least 1A)
    • a clear moulded plastic case to house the PI (~ € 6)
    • a Transcend TS32GSDHC10E 32 GB SDHC class 10 memory card (~ € 24; a smaller card would do, but I initially wanted enough space to host an artifact repository)
    • an Edimax EW-7811UN USB Nano Wireless adapter, 150 Mbps (~ € 12; required only if you want to place your PI out of reach of an Ethernet plug)
  • allow SSH access to your PI (Gitolite or not, this is something you will do during the first setup under Raspbian Wheezy ...)
  • install git and perl on your PI using the usual apt-get install commands
  • create a git account with useradd
  • follow the installation instructions of Gitolite as you can find them on the GitHub repository (see the command sketch just after this list).
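
As a rough sketch, the whole installation boils down to something like the following (package names and the key file location are just examples; the authoritative steps are in the Gitolite documentation):

# as root, on the PI
apt-get install git perl
useradd -m git

# as the git user, with your public key copied to /tmp/bob.pub beforehand
su - git
git clone git://github.com/sitaramc/gitolite
mkdir -p ~/bin
gitolite/install -ln ~/bin
~/bin/gitolite setup -pk /tmp/bob.pub
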
Gitolite enables you to set up a server of Git repositories on a dedicated host. Access control is based on users' public SSH keys, and access to each repository can be controlled per user or per group of users.

There is no need for a GUI to administer a Gitolite server; a particular Git repository, gitolite-admin (which you clone on your workstation like any other repository), is used to create and configure repositories and to grant rights to users.

The initial conf/gitolite.conf file of the gitolite-admin project gives, at first sight, an idea of the main principle of Gitolite:

repo gitolite-admin
    RW+     =   bob

repo testing
    RW+     =   @all

You are bob; you have installed Gitolite on the PI (and as you gave your public key bob.pub during installation, you are allowed to administer the gitolite-admin repository). Every known user (@all) has full Git access to the testing repository.

If you want to add a new repository named myrepo1 to your gitolite server, you just have to clone the gitolite-admin repository on your development workstation, and add the following lines to conf/gitolite.conf:

repo myrepo1
    RW+     =   bob
    R       =   alice

Alice being a new "read only" player, you also have to add her public key, named alice.pub, into the keydir directory of the gitolite-admin project.

Add, commit, and push; et voilà ! Your new repository myrepo1 is clonable by Bob and Alice.
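
In practice, on your workstation, this last step looks like this (a sketch; the clone URL depends on the hostname of your PI, here assumed to be raspberrypi):

cd gitolite-admin
git add conf/gitolite.conf keydir/alice.pub
git commit -m "add myrepo1, grant read access to alice"
git push

# Bob and Alice can then clone the new repository:
git clone git@raspberrypi:myrepo1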

Instructions for deleting a repository can be found here.

Even if you do not use your Synology device to host a Git server, you can still use it to back up your Raspberry's Gitolite repositories with rsync. This is another story ...
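
Just to give the idea, it can be as simple as a one-liner run from the NAS (hostname and paths are hypothetical):

rsync -az --delete git@raspberrypi:repositories/ /volume1/backup/gitolite-repositories/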







Saturday, March 23, 2013

Environment variables and Maven

Recently, due to security restrictions, like others, I lost VPN access to my office.

I was still able to access our source code repository from the outside, but the VPN gave me remote access to our internal artifact repository (Apache Archiva), and thus the same conditions for remote development (compile, deploy, release) as I have at work.

The workaround I found to continue to work nearly as before was the following:

1) Install Archiva on my workstation. It was a good occasion for me to use Puppet for that.

2) Import (tar cf / xf) the required artifacts (the ones that cannot be rebuilt, what I call "postulate artifacts") into this Archiva instance.

3) Use environment variables pointing to an artifact repository in my settings.xml (local to each development environment) and in all of our projects' pom.xml files (in SVN). For instance:
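
A sketch of what such an entry can look like (the repository id and path are only illustrative; Maven resolves environment variables through the env. prefix):

<repository>
  <id>internal</id>
  <url>http://${env.A_HOST}:${env.A_PORT}/archiva/repository/internal/</url>
</repository>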




Obviously, A_HOST and A_PORT must be defined in all development environments (CLI, Maven, Jenkins, IDEs), but this is not a big deal.

The only thing I lost compared to the VPN way was access to THE company repository, which allowed me to deploy or release versions remotely, but this is no surprise ...

Monday, March 11, 2013

Installing ElasticSearch using Puppet


I am currently working on two great technologies: Puppet and ElasticSearch.

The purpose of this post is to describe a Puppet module which deploys ElasticSearch and selected plugins as a service in Linux or Mac OS environments.
You should try ElasticSearch ...

I am not an expert in these technologies; what I describe here is just a snapshot of what I am able to do at a given point (March 2013) of my Puppet & ElasticSearch learning curve...

This kind of work has already been done many times, and now it's my turn !

Limitations

This module has been tested against OpenSuse 12.2 and Mac OS 10.6.8, ElasticSearch 0.20.5 and the latest version of the ElasticSearch service wrapper as I found it on GitHub around mid-February 2013.

I tested the ElasticSearch resulting installations in the following ways:
  • batch indexing on the OpenSuse (master) node (180 million documents distributed over 8 shards)
  • replication on Mac-OS nodes (puppetized or not)
  • search on all types of covered nodes
  • usage of the head plugin
  • upgrading / downgrading in place with 0.20.5 and 0.20.6 versions of Elasticsearch
No further warranties; this is still unfinished work !

Context

  • The Puppet Master is hosted on a Raspberry PI model B running Raspbian Wheezy (Puppet: 2.7.18-2, Ruby: 1.8.7). No compilation of Puppet from source code; I used what comes with the distribution (the same stands for OpenSuse 12.2). Modules are managed using Git, thanks to Git and my Synology Diskstation.
  • Puppet (2.7.18) and Facter (1.6.11) on Mac OS come from the Mac OS Puppet download site. It is on purpose that I did not use the latest versions available in each case: my intention was to keep the Puppet agents at the same level as the master. Installing these disk images is not sufficient to launch the Puppet agent at Mac OS start-up; do not forget to read this.
  • The version of the Puppet agent on OpenSuse 12.2 is 2.7.6-4.1.2. It runs under Ruby 1.9.3-2.2.1. Talking about Ruby 1.9.3 and Puppet, I remember that this post helped me a lot to get my client cert signed by the Puppet master.
I will not describe here how to setup the Puppet Master and its agents, as this topic is largely covered by multiple resources on the Internet.

Prerequisite

ElasticSearch and the service wrapper zip binaries must be located under the files/elasticsearch mount point of the Puppet Master (the service wrapper adds the multi-platform service functionality to ElasticSearch, thanks to Tanuki Software, which to my knowledge is also used (and embedded) in Apache Archiva).

/etc/puppet/fileserver.conf must contain a section resembling:

[files]
  path /etc/puppet/files
  allow *.your.domain

Under the path /etc/puppet/files, there should be a directory for the ElasticSearch module containing the following files (wrapper.zip being a zip archive built using the git archive command and elasticsearch-VERSION.zip being the version of ElasticSearch that has to be puppet distributed):

/etc/puppet/files
└── elasticsearch
    ├── elasticsearch-0.20.5.zip
    └── wrapper.zip

Module parameters


This module is implemented as a single puppet class with the following parameters:
  • els_username which defaults to elasticsearch, is the Unix user name used to run the ElasticSearch node.
  • els_uid is the UID of els_username.
  • els_groupname which defaults to elasticsearch, is the Unix group name of els_username.
  • els_gid is the GID of els_groupname.
  • els_homedir is the directory under which ElasticSearch will be deployed.
  • els_env is the path where the template env will be created. The resulting file aims at gathering environment variables and system settings. It is sourced by the service wrapper script (see below).
  • els_version is the version of ElasticSearch that has to be installed. There must be an existing elasticsearch-<els_version>.zip file under the /etc/puppet/files/elasticsearch directory (see pre-requisites above).
  • els_clustername is the ElasticSearch cluster name of the node (all the nodes with the same cluster name belong to the same cluster).
  • els_nodename is the ElasticSearch node name. 
  • els_masters is an array of ElasticSearch masters (see below for an explanation of why I did not use multicasting, which is the default for ElasticSearch).
  • els_node2node_port which defaults to 9300, is the port number used by ElasticSearch  nodes to communicate between them.
  • els_http_port which defaults to 9200, is the port number used to communicate with the cluster (eg: REST API, site plugins, ...).
  • els_is_master which defaults to false, indicates if the node is a master node. 
  • els_is_data which defaults to true, indicates if the node is a data node. 
  • els_nb_of_shards which defaults to 5, is the default number of shards.
  • els_nb_of_replicas which defaults to 1, is the default number of replicas for each shard.
  • els_heap_size which defaults to 1024, is the heap size of the JVM of the ElasticSearch server. Units are megabytes, and the usual letters (G, M) must not be used as they generate an error with the service wrapper script. 
  • els_plugins which defaults to [], is an array of the plugins which must be installed on each node.  Each plugin is described by a space separated string: the first part of the string is the name of the plugin as it is passed as a parameter to the bin/plugin ElasticSearch shell script for installation, the second part of the string is the name of the directory which is created under the ElasticSearch plugins directory after the installation of the plugin is completed. This trick is used to check that the plugin has already been installed. Example of such string: mobz/elasticsearch-head head.
  • els_ensure_running which defaults to false, indicates if Puppet should verify that ElasticSearch is running as a service.  

This class is able to update ElasticSearch just by changing the value of the els_version parameter in the node file on the Puppet master, as long as no more than a service restart is required (and obviously after having downloaded the corresponding binary zip file into the /etc/puppet/files/elasticsearch directory of the Puppet master).

Note that plugins are neither updatable nor removable; they can only be installed. It should not be too difficult to make them more Puppet-dynamic, for instance by adding an action sign before the plugin name (! to force reinstall, - to remove, ...) and modifying the install_plugin define (see below).

Usage example


node 'mini.your.domain' {
   class { 'elasticsearch' :
      els_uid => 1963,
      els_gid => 1963,
      els_heap_size => 1536,
      els_homedir => "/Users/elasticsearch",
      els_env => "/Users/elasticsearch/elasticsearch.env.sh",
      els_version => "0.20.5",
      els_nodename => "elasticsearch@$fqdn",
      els_clustername => "mycluster",
      els_masters => [ 'mahina.your.domain' ],
      els_nb_of_shards => 8,
      els_ensure_running => true,
      els_plugins => [
         'mobz/elasticsearch-head head',
         'elasticsearch/elasticsearch-mapper-attachments/1.6.0 mapper-attachments'
      ],
   }
}


The module's files

The module is made up of an init.pp file and three templates (one for the ElasticSearch configuration, one for installing it as a MacOS service, and the last one for hosting environment variables used by the service wrapper).


manifests/init.pp

This is the main file. It creates a tree that will look like:

homedir
├── downloads
│   ├── elasticsearch-0.20.5.zip
│   └── wrapper.zip
├── elasticsearch-0.20.5
│   ├── bin
│   │   ├── elasticsearch
│   │   ├── elasticsearch.bat
│   │   ├── elasticsearch.in.sh
│   │   ├── plugin
│   │   ├── plugin.bat
│   │   └── service
│   │       ├── elasticsearch
│   │       ├── ...
│   ├── ... 
│   ├── config
│   │   ├── elasticsearch.yml
│   ├── lib 
│   │   ├── elasticsearch-0.20.5.jar
│   │   ├── ... 
│   ├── plugins
│   │   ├── head
│   │   │   └── ... 
│   │   └── mapper-attachments
│   │       ├── ... 
├── elasticsearch.env.sh
├── elasticsearch_content
│   ├── data
│   ├── log
│   └── piddir 
└── elasticsearch_current -> /home/elasticsearch/elasticsearch-0.20.5


First of all, the group and the user under which ElasticSearch will run are created if they are not present on the target node. The same goes for the home directory.

Under the home directory, a downloads directory is created. This directory will be used to store the files downloaded from the Puppet master fileserver.

An elasticsearch_content directory is also created. It will be used to store all the content of a running node (shards, log files, ...), independently of the version of ElasticSearch itself. Under this directory, a piddir directory is created; it will be used by the service wrapper to store its status independently of the version of ElasticSearch it is bound to.

Then the ElasticSearch zip file is downloaded and stored into the download directory. It is unzipped under the home directory using its zip native name: elasticsearch-<els_version> (e.g.: elasticsearch-0.20.5).

Permissions of the unzipped ElasticSearch directory (and its sub-directories) are fixed thanks to Perl and find. Just a word about that: I think that Perl is a good match with Puppet for Unix boxes, as it is portable for simple things; far more portable than sed, for instance, for the same kind of tasks.

A link is established between elasticsearch-<els_version> and elasticsearch_current.

Note that prefixing the content and current directories with elasticsearch allows this class to deploy ElasticSearch alongside other middleware under the same home directory, using the same deployment pattern (current/content).

Now that an ElasticSearch version is available, specified plugins are installed using the ElasticSearch plugin script (see the define install_plugin).

The ElasticSearch configuration file (elasticsearch.yml) is then created using the Puppet templating mechanism.

Next, the service wrapper is downloaded from the Puppet master, unzipped in the right place (under the bin directory of the ElasticSearch installation), and the service file is patched (thanks to Perl) to include the environment file that will be created at the next step, and to set the proper values for PIDDIR and ES_HOME. These two variables must not be left at their default values, so that ElasticSearch restarts properly when a new version is puppet-installed.

The final step is dedicated to the service installation. Depending on the kind of target node, a plist file is installed (MacOS) or a link is set under /etc/init.d (Linux).

In both cases, the service is installed, thanks to Puppet.

class elasticsearch (
   $els_username = elasticsearch,
   $els_uid,
   $els_groupname = elasticsearch,
   $els_gid,
   $els_homedir,
   $els_env,
   $els_version,
   $els_clustername,
   $els_nodename,
   $els_masters,
   $els_node2node_port = 9300,
   $els_http_port = 9200,
   $els_is_master = false,
   $els_is_data = true,
   $els_nb_of_shards = 5,
   $els_nb_of_replicas = 1,
   $els_heap_size = 1024,
   $els_plugins = [],
   $els_ensure_running = false
) {
   $els_name="elasticsearch-${els_version}"
   $els_base="$els_homedir/$els_name"
   $els_current="$els_homedir/elasticsearch_current"
   $els_content="$els_homedir/elasticsearch_content"
   $els_downloads="$els_homedir/downloads"
   $els_datadir="$els_content/data"
   $els_workdir="$els_content/work"
   $els_logdir="$els_content/log"
   $els_piddir="$els_content/piddir"
   $els_wrapper_script="$els_current/bin/service/elasticsearch"

   if ($operatingsystem == "Darwin") {
      $els_notify="Service[org.tanukisoftware.wrapper.elasticsearch]"
   } else {
      $els_notify="Service[elasticsearch]"
   }

   File { 
      owner   => $els_username,
      group   => $els_groupname, 
      mode    => '0644',
   }

   Exec {
      user    => $els_username,
      group   => $els_groupname,
      cwd     => $els_homedir,
      path    => "/usr/bin/:/bin",
      timeout => 900,
   } 

   define install_plugin {
      $p_array=split($title,' ')
      $p_name=$p_array[0]
      $p_dir=$p_array[1]

      exec { "install elasticsearch plugin $p_name":
         command   => "$els_current/bin/plugin -install $p_name",
         logoutput => true,
         creates   => "$els_current/plugins/$p_dir",
      }
   }

   group { "$els_groupname" :
      ensure  => present,
      name    => $els_groupname,
      gid     => $els_gid,
   }
   
   user { "$els_username" :
      require => Group[$els_groupname],
      ensure  => present,
      name    => $els_username,
      uid     => $els_uid,
      gid     => $els_groupname,
      shell   => '/bin/bash',
      home    => $els_homedir,
      comment => 'ElasticSearch User',
   }

   file { "$els_homedir" :
      require => User[$els_username],
      ensure  => directory,
      mode    => 755,
   }

   file { "$els_content" :
      require => File[$els_homedir],
      ensure  => directory,
      mode    => 755,
   }
   
   file { "$els_piddir" :
      require => File[$els_content],
      ensure  => directory,
      mode    => 755,
   }
   
   file { "$els_downloads" :
      require => File[$els_homedir],
      ensure  => directory,
      mode    => 755,
   }
  
   file { "$els_downloads/$els_name.zip" :
      require => File["$els_downloads"],
      source  => "puppet:///files/elasticsearch/$els_name.zip",
   }
 
   exec { 'unzip elasticsearch' :
      require => File["$els_downloads/$els_name.zip"],
      command => "unzip $els_downloads/$els_name.zip",
      creates => "$els_base",
   }

   # Perms in zip file are too wide (777)
   exec { 'fix directories perms elasticsearch' :
      require => Exec['unzip elasticsearch'],
      command => "find $els_base -type d -exec chmod go-w {} \;",
      onlyif  => "perl -e 'exit(sprintf(\"%o\", (stat(\"$els_base\"))[2]&00077) ne \"77\")'",
   }

   file { "$els_current" :
      require => Exec['unzip elasticsearch'],
      ensure  => link,
      target  => "$els_base",
      notify  => $els_notify,
   }

   install_plugin { $els_plugins :
      require => File["$els_current"],
      notify  => $els_notify,
   }

   file { "$els_current/config/elasticsearch.yml" :
      require => File["$els_current"],
      content => template("elasticsearch/elasticsearch.yml"),
      notify  => $els_notify,
   }

   file { "$els_downloads/wrapper.zip" :
      require => File["$els_downloads"],
      source  => "puppet:///files/elasticsearch/wrapper.zip"
   }

   exec { 'install elasticsearch wrapper' :
      require => [ File["$els_current"], File["$els_downloads/wrapper.zip"] ],
      cwd     => "$els_current/bin",
      command => "unzip '$els_downloads/wrapper.zip'",
      creates => "$els_current/bin/service",
   }

   exec { 'source env in elasticsearch wrapper':
      require => [ Exec['install elasticsearch wrapper'], File["$els_env" ] ],
      command => "perl -pi.bak -e 'print \". $els_env # KILROY WAS HERE\n\" if $. == 2' $els_current/bin/service/elasticsearch",
      onlyif  => "grep -v '# KILROY WAS HERE$' $els_current/bin/service/elasticsearch",
      creates => "$els_current/bin/service/elasticsearch.bak",
   }

   exec { 'fix elasticsearch wrapper':
      require => Exec['source env in elasticsearch wrapper'],
      command => "perl -pi.fix -e 's|^PIDDIR=\".\"$|PIDDIR=$els_piddir|; s|^export ES_HOME=.*$|export ES_HOME=$els_current|' $els_current/bin/service/elasticsearch",
      creates => "$els_current/bin/service/elasticsearch.fix",
   }

   file { "$els_env" :
      require => [ File["$els_homedir"], User["$els_username"] ],
      content => template("elasticsearch/env"),
      notify  => $els_notify,
   }

   if ($operatingsystem == 'Darwin') {
      file { '/Library/LaunchDaemons/org.tanukisoftware.wrapper.elasticsearch.plist':
         require => Exec['fix elasticsearch wrapper'],
         content => template("elasticsearch/elasticsearch.plist"),
         owner   => root,
         group   => wheel,
         mode    => 0644,
      }

      service { 'org.tanukisoftware.wrapper.elasticsearch' :
         require => File['/Library/LaunchDaemons/org.tanukisoftware.wrapper.elasticsearch.plist'],
         enable  => true,
         ensure  => $els_ensure_running,
      }
   } else {
      file { '/etc/init.d/elasticsearch':
         require => Exec['fix elasticsearch wrapper'],
         ensure  => link,
         owner   => root,
         group   => root,
         target  => "$els_current/bin/service/elasticsearch",
      }

      service { 'elasticsearch' :
         require => File['/etc/init.d/elasticsearch'],
         name    => "elasticsearch",
         enable  => true,
         ensure  => $els_ensure_running,
      }
   }
}

templates/elasticsearch.yml

This is the configuration file used for all the nodes of the cluster, whether they are master or data nodes.

It is rather simple and probably far from what is required in a production environment.

Just a word about it: I am not using multicasting to discover master nodes (see the <% if !els_is_master %> section below) because of my home network peculiarities and what I want to do with ElasticSearch at the present time. I guess that this is not a good practice, and it is one more good reason to adapt this file to your needs.

cluster.name: "<%= els_clustername %>"
<% if @els_nodename %>
node.name: "<%= els_nodename %>"
<% end %>
node.master: <%= els_is_master %>
node.data: <%= els_is_data %>
index.number_of_shards: <%= els_nb_of_shards %>
index.number_of_replicas: <%= els_nb_of_replicas %>
path.data: <%= els_datadir %>
path.work: <%= els_workdir %>
path.logs: <%= els_logdir %>
transport.tcp.port: <%= els_node2node_port %>
http.port: <%= els_http_port %>
<% if !els_is_master %>
discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: [<% els_masters.each do |host| -%> <%= host -%>, <%end -%> ]
<% end %>


templates/env

This file is sourced by the service wrapper (see how above in the init.pp file). The first three export lines are used by the wrapper. The following two lines may be used by ElasticSearch clients (such as scripts made of curl requests, or a Java program) to access the cluster.

The purpose of the last line is to give the indexing nodes enough resources to index a big dataset (in my case a corpus of 1.3 billion words). I guess it should be turned into a class parameter, and maybe made conditional on the node type (master node, data-only node, ...).

export RUN_AS_USER=<%= els_username %>
export ES_HOME=<%= els_current %>
export ES_HEAP_SIZE=<%= els_heap_size %>
export ES_HTTP_PORT=<%= els_http_port %>
export ES_NODE2NODE_PORT=<%= els_node2node_port %>

# Number of files that can be opened simultaneously
ulimit -n 4096

templates/elasticsearch.plist

This file is specific to Mac OS; it is used to install ElasticSearch as a service.

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN"
"http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
    <dict>
        <key>Label</key>
        <string>org.tanukisoftware.wrapper.elasticsearch</string>
        <key>ProgramArguments</key>
        <array>
            <string><%= els_wrapper_script %></string>
            <string>launchdinternal</string>
        </array>
        <key>OnDemand</key>
        <true/>
        <key>RunAtLoad</key>
        <true/>
        <key>UserName</key>
        <string>elasticsearch</string>
    </dict>
</plist>


Resources

Internet resources aside (among them this one), I particularly appreciated the content of these three books, each of which has its own interest.

Updates

  • 2013/03/31: update to the init.pp code related to a problem with ElasticSearch version updates. When the version parameter was changed (0.20.5 to 0.20.6), the service was not restarted because the newly installed wrapper did not point to the actual pid file (which was located in a sub-directory of the previous version of ElasticSearch). 
  • 2013/04/11: 
    • Added the elasticsearch.plist template file for MacOS, which I had forgotten
    • Tested this module on Suse SLES 11 SP2, where Puppet is currently at version 2.6.17-0.3.1. I made two updates:
      • remove the comma at the end of the last parameter line of the class definition. It is accepted in 2.7, but it generates a syntax error in 2.6.
      • change the require of file["$els_current/config/elasticsearch.yml"]: it must depend on File["$els_current"]. I did not see this bug in 2.7. 

Sunday, January 20, 2013

When chown does not work silently on MacOS

Each Unix flavor has its very own personality.

Today, with MacOS, I spent two hours trying to understand why a sudo chown -R did not work properly on a directory tree.

Exit code: 0
Error message: none

Nothing was done! My files remained, try after try, with their original ownership, which was:

_unknown:_unknown

Mysticism.

Some context: these files, located on an external drive (Mac OS Extended formatted), resulted from a tar copy of a user account folder from a disk which belonged to an early 2008 MacBook Pro that had just died from an Nvidia attack a few weeks after the deadline (the NVIDIA GeForce 8600M GT graphics processor repair extension program ended on December 7, 2012. Sigh). 

I found the answer when I realized that the external nature of the drive was the key. The solution was here (verbatim):

From the output, it is clear that the drives have "ownership" disabled, or owners aren't supported under OS X (eg. if the drive is FAT formatted).

When ownership on a volume is disabled, everything is treated as if it is owned by the user and group "unknown" ('uid' and 'gid' 99). Items owned by "unknown" have the property of appearing to be owned by the owner of the process attempting to access the item, and only "root" can see their true nature (in 10.4 and later). In this context, 'chown' isn't meaningful.

To see if ownership is enabled, use "Disk Utility.app", and possibly "Get Info" in the "Finder". Or else, on the command line, eg:
vsdbutil -c /Volumes/terra

Ownership can be enabled (assuming the volume format supports it) using the GUI, or with eg.:
/usr/bin/sudo /usr/sbin/vsdbutil -a /Volumes/terra


I do not know who you are, biovizier (the author of the answer), but I thank you very very much !

What I did not understand, though, was the fact that this kind of chown had worked the day before on the same drive ... I certainly did something wrong (no more mysticism !), but I still do not know what ...

Oh ? And what about TimeMachine to recover the lost stuff ? No luck this time: the backup had been restarted from scratch a few days before the laptop died, and it was still "in progress". I was unaware of the danger: in this case, your only fresh full backup is your hard drive (as TimeMachine was still under way, the "in progress" data contained only 30GB out of 160GB). Shiver.

My advice: when TimeMachine restarts a backup from scratch, put your laptop on a wired connection right away to speed up the backup, and think about the last time you made a SuperDuper clone of it ...

Friday, January 4, 2013

How to setup a git server on a Synology NAS

Context

  • Software development (Linux, MacOS, Java / Eclipse)
  • Home network (security is not a concern in this context)
  • Synology NAS (DS 112j, DSM 4.1) 
Important update (August 31st, 2013): since I wrote this post, Synology has incorporated a Git server into DSM 4.3. So what follows is deprecated...

Nevertheless, you may find that it is far simpler and more powerful to set up a Git server on a Raspberry PI using Gitolite.

This memo describes how to set up a Git-over-SSH server on a Synology NAS. Use it at your own risk ! Your context certainly differs from mine ...



Enable SSH on the Synology
First of all, SSH must be activated on the NAS. This can be done using the  Control Panel of the DSM and the Terminal app. SSH will be used to connect to the Synology, and as the communication protocol for Git.

Bootstrap ipkg

As Git is not installed by default on Synology devices (like many useful Unix commands), one has to extend the DSM using ipkg, the Itsy package manager, a lightweight package manager for embedded devices modelled after Debian's dpkg.

More information (and disclaimers) about modifying Synology devices can be found here.

Identify the bootstrap file

The first thing to do is to identify the corresponding bootstrap file for your Synology device (popular bootstrap URLs are listed here).

Log into the NAS

Then  log into the Synology device (diskstation2 in this example) using ssh:

% ssh root@diskstation2
root@diskstation2's password: 

BusyBox v1.16.1 (2012-09-26 03:28:29 CST) built-in shell (ash)
Enter 'help' for a list of built-in commands.

DiskStation2#

(I changed the original prompt which ends with a redirection sign ">". I hate redirection signs in prompts...)

Get the bootstrap file

Get the bootstrap file (for a DS112j target in this example) in a temporary directory and install it:

DiskStation2# cd /volume1/@tmp
DiskStation2# wget http://ipkg.nslu2-linux.org/feeds/optware/cs08q1armel/cross/unstable/syno-mvkw-bootstrap_1.2-7_arm.xsh



Run the bootstrap

DiskStation2# sh syno-mvkw-bootstrap_1.2-7_arm.xsh

then do some cleaning:

DiskStation2# rm syno-mvkw-bootstrap_1.2-7_arm.xsh


Modify root's user PATH

Modify root's PATH in .profile: include /opt/bin and /opt/sbin at the beginning of the PATH variable (/opt is where ipkg installs things, /opt/sbin being useful if you want to install commands like lsof).
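
Something like this near the top of /root/.profile does the trick (a sketch; keep whatever else is already in the file):

PATH=/opt/bin:/opt/sbin:$PATH
export PATH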


Reboot the NAS 

(don't forget to warn your family users ...)

DiskStation2# reboot now

(or use the web interface; in any case, you must hear the born-again beep ...)

Install packages

Log again and install required packages:

DiskStation2# ipkg install coreutils


DiskStation2# ipkg install git





Make symbolics links of git commands into /usr/bin 



This is a simple way to make these commands available via sshd.

DiskStation2# for f in `ls /opt/bin/git*`
> do
> ln -s $f /usr/bin     
> done

Alternative: use ~git/.ssh/environment and modify /etc/ssh/sshd_config to set PermitUserEnvironment to yes as explained here.

Create a git user on the NAS

Use the web interface to create a git user (belonging to the group users) with no privileges.

Several extra steps are required :

As the root user
  • Change some fields of the git user in /etc/passwd (using vi) - see the example after this list
    • Replace the home directory
      • /var/services/homes/git becomes /volume1/git
    • Replace the login shell
      • /sbin/nologin becomes /bin/ash
As the git user
  • Become the git user: DiskStation2# su - git
  • Just to be sure, check that the git user's homedir belongs to the git user (this should be the case)
  • Create a .profile file
    • Copy and adapt root's .profile file
    • Don't forget to change the HOME variable accordingly (I did not understand why the HOME variable is not set up automatically ...)
  • chmod the git user's homedir to 700
  • create a .ssh folder within the git user's homedir, again with 700 perms
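
For illustration, the git line in /etc/passwd ends up looking something like this (the UID/GID values are just an example):

git:x:1026:100:Git user:/volume1/git:/bin/ash
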
Note: I realized some time later that relocating the git user's homedir to /volume1 had the side effect of automatically creating (at the next reboot) a git share. Using the web interface, I then disabled this share for the members of the users group, and hid it from the network places.

Authorize git users

As the git user just created on the Synology will be accessed for Git purposes over the SSH protocol by your development account, you must add the public key of your development account to the ~git/.ssh/authorized_keys file (whose permissions must be 600).

Doing this will also let you access the git account of the Synology from your development account securely and without supplying a password.

I used vi and copy / paste between xterms to copy my public key into the authorized_keys file.
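
A copy/paste-free alternative, run from the development workstation (assuming the key pair already exists there and the .ssh folder has been created on the NAS as described above):

myself@myhost:~/% cat ~/.ssh/id_rsa.pub | ssh git@diskstation2 'cat >> .ssh/authorized_keys && chmod 600 .ssh/authorized_keys'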

If you are not familiar with SSH and public/private keys, you should read this article (how to create a private/public key pair, how to include the public key in a remote authorized_keys file).

If the following command fails (replace diskstation2 with the hostname or IP address of your NAS) or prompts you for a password, it means that your SSH setup is not correct:

myself@myhost:~/% ssh git@diskstation2 ls /etc/shells
/etc/shells

Test the git server

On the Synology

  • Create a repositories folder in the git homedir
  • cd to it, then create a git repo:
git@DiskStation2% echo $PWD
/volume1/git/repositories

git@DiskStation2% mkdir aiuto.git
git@DiskStation2% cd aiuto.git/
git@DiskStation2% git init --bare
Initialized empty Git repository in /volume1/git/repositories/aiuto.git/
git@DiskStation2% ls
HEAD  branches  config  description  hooks  info  objects  refs

On a client workstation

myself@myhost:~/dev/projects% git clone ssh://git@diskstation2/volume1/git/repositories/aiuto.git
Cloning into 'aiuto'...
warning: You appear to have cloned an empty repository.
myself@myhost:~/dev/projects% ls -a aiuto/
.  ..  .git
myself@myhost:~/dev/projects% cd aiuto/
myself@myhost:~/dev/projects/aiuto% echo "V0.1 - 2 janvier 2013" > Changes.txt
myself@myhost:~/dev/projects/aiuto% git add Changes.txt
myself@myhost:~/dev/projects/aiuto% git commit -m "Changes.txt file created - release notes"
[master (root-commit) 82fcd36] Changes.txt file created - release notes
 1 file changed, 1 insertion(+)
 create mode 100644 Changes.txt
myself@myhost:~/dev/projects/aiuto% git push origin master
Counting objects: 3, done.
Writing objects: 100% (3/3), 267 bytes, done.
Total 3 (delta 0), reused 0 (delta 0)
To ssh://git@diskstation2/volume1/git/repositories/aiuto.git
 * [new branch]      master -> master


References

The links below helped me to work on this topic.

http://www.wonko.de/2010/04/set-up-git-on-synology-nas.html
http://www.bluevariant.com/2012/05/comprehensive-guide-git-gitolite-synology-diskstation/
http://stackoverflow.com/questions/10888300/gitosis-vs-gitolite (reading one of the answers, I decided to not use gitolite to begin with git; being the only software developer in casa ...).


Note: An alternative to installing Git on a Synology device is to install it on a Raspberry PI (pictured: a Raspberry PI 512MB in a box. Git friendly !). For those of you who don't know, a Raspberry is an affordable credit-card-sized computer, running under Linux (particularly Raspbian, which is a Debian derivative), equipped with 512MB of memory, and which uses an SD card as its storage device.