Static code analysis in Python with Jenkins

I am checking the code quality of a python script called harvester.py which is in the root folder of my code .

To do so I installed the following packages (I am with Python 27 on Windows machine):

pip install pylint
easy_install -U clonedigger
pip install flake8

On Jenkins I make sure I setup Warning plugin and Violation plugin. In Jenkins configuration under Compiler Warning section we add a parser for flake8 with:

Name: flake8
Link name: Flake8 warnings
Trend report name: Flake8 warnings trend
Regular expression: ^(.*):([0-9]*):([0-9]*):(.[CE][0-9]*)(.*)$
Mapping script:
import hudson.plugins.warnings.parser.Warning
import hudson.plugins.analysis.util.model.Priority

String fileName = matcher.group(1)
String lineNumber = matcher.group(2)
String category = matcher.group(4)
String message = matcher.group(5)

return new Warning(fileName, Integer.parseInt(lineNumber), category, "PyFlakes Parser", message, Priority.NORMAL);;

Note from the regular expression the [CE] pattern in order to find Complexity and Error id problems. If you are not interested in the complexity just replace [CE] with E.

Now on the jenkins job configuration we add a build step “Execute Windows batch command” (you could do it similarly for Linux) with the following content:

rmdir /s /q output
mkdir output
pylint --msg-template="{path}:{line}: [{msg_id}({symbol}), {obj}] {msg}" --reports=y harvester.py >> output/pylint.log
clonedigger --cpd-output harvester.py -o output/clonedigger.xml
flake8 --max-complexity 12 harvester.py --output-file output/flake8-output.txt

The batch creates an output folder where pylint, clonedigger and flake8 generate their files (if you want you could put the script inside in a bat file having as parameters the output folder and the harvester.py file).

Add a post build action “Scan for compiler warnings”, select “scan workspace files” and with the following parameters:

File patterns: output/flake8-output.txt
Parser: flake8

Add a post build action “Report violations”  with the following parameters:

cpd: output/clonedigger.xml
pylint: output/pylint.log

Run the build and you should see the trend graph for Flake8 and the violations for cpd (clonedigger) and pylint!

P.S.:

If Flake8 exits with exit 1 failing your build, you might probably change the file:

C:\Python27\Lib\site-packages\flake8\main.py

At the end of the def main(): function you should have:

if exit_code > 0:
    raise SystemExit(exit_code > 0)

Which you can replace with:

raise SystemExit(0)

Now Flake8 should exit without problem.

 

 

 

 

 

Drupal and Open data

The “open data” topic is always in the air and, thanks to the open source world, it is becoming more and more spread.

When we want to manage open data we need to consider 2 types of software:

  1. linked data applications;
  2. data catalogs.

The first type is oriented towards the use of triple store databases specialized in store RDF triples, we find solutions like:

We find as well some REST API which aim to connect to triple stores:

The second type is oriented to the management of the open data:

Drupal has also a couple of interesting modules to connect to triple stores:

Another interesting module is RDFx which allows to manage RDF mapping with Drupal Content types  and it works in combination with the RESTWS module so you can have the RDF extraction of a content type by simply adding “.rdf” to your node (for example http://localhost/drupal/node/1.rdf)

Well if you find more open source applications, which might connect to Drupal, let me know :-)

 

Connect Drupal 7 and Github via Oauth

In a project where I am working on there is a need to connect Drupal with Github.

Github offers an interesting API which is based on the Oauth protocol like Google and Twitter.

The first thing to do is creating an account on Github and on the profile page add an application url even setup in localhost (like http://localhost/drupal). When done, you will receive a client id and a client secret which will be used by Drupal to connect as client.

I setup Drupal on my laptop via Xampp, which means creating quickly a user “drupal” with a database “drupal” via phpmyadmin and then extract Drupal in my htdocs folder (using drupal as folder name) and follow the instructions at http://localhost/drupal.

On Drupal you need download and setup the Oauth connector module which comes with some dependencies, in particular the Oauth module which can be used for Oauth2.

Once everything is setup you can start to configure and for this you can follow the steps in this guide for  Facebook:

http://fuseinteractive.ca/blog/how-allow-facebook-logins-your-drupal-site#.VRwyEjuUdLZ

with the data used from here (I just forked the original project):

https://github.com/EmidioStani/drupal-oauthconnector-providers/blob/master/GitHub.php

as you can see there is a mapping to be done between the github profile attributes and the drupal profile attributes. Since the Github name attribute is optional I preferred to use the login attribute (you can see some attributes from my profile), so just change it in:

'name' => array(
'resource' => 'https://api.github.com/user',
'method post' => 0,
'field' => 'login',
'querypath' => FALSE,
'sync_with_field' => 'name',
),

Unfortunately the avatar will not be copied but you can contribute to the module :-)

When you will connect to Github you might experience 2 types of problem:

If everything goes fine, at the end you will have a github button at the login which redirect the user to Github and come back to Drupal.

Be careful that when you come back to Drupal you have a session cookie therefore the new user is authorized but, since Drupal by default blocks the user (because by default the administrator should authorize and the account should be confirmed by email), you cannot login with your account; also my Drupal doesn’t send emails. So I suggest to remove the cookie from your browser, login with the administrator and unblock the user (you will be able to login with the github account) and change the options in Drupal to allow people registering automatically without waiting that the administrator unblocks the user.

With the last changes you would expose Drupal to security issue but I didn’t try it a Drupal hosted on a website and with email confirmation.

 

 

 

 

Avoid XML Schema restrictions

I am back on XML Schema design and I do really like it !

One of the challenge that I am facing now is including or not the UBL metalanguage in my schemas (see some video on the oasis website), an Oasis standard, as you can deduce.

As you can see from this file:

http://docs.oasis-open.org/ubl/os-UBL-2.1/xsd/common/UBL-UnqualifiedDataTypes-2.1.xsd

UBL metalanguage has elements which are based on the UN/CEFACT elements, see:

http://docs.oasis-open.org/ubl/os-UBL-2.1/xsd/common/CCTS_CCT_SchemaModule-2.1.xsd

So as you can see UN/CEFACT has ComplexType which are extended by adding optional attributes, while the UBL elements are sometime restrictions  on the attributes (by making them required) and sometimes the elements are just extended giving a freedom in a later moment to choose what to add.

Independently from  the current problem, it happens that we might not need all the attributes or simply all the elements for our schema.

So either we add restrictions or we copy the elements that we need in our schema but simplified.

Adding restrictions can be possible in two ways:

1) Restrict on the patterns, like the max length of an element, they are so called facets, see: http://www.w3.org/TR/xmlschema-0/#SimpleTypeFacets

2) Restrict on the number of elements or attributes, in the case of the attributes  we need to make them prohibited

Conceptually both are restrictions based on an another type and this creates a problem in Object Oriented languages, like Java (hence the tittle of this post) which supports extensions (at the end a restriction is a sort of extension of a base object with some changes).

Jaxb is the official standard used for databinding XML elements to Java objects which relies on XJC for the conversion. Such standard adds Java annotations to associate XML elements to Java objects. Further, Jaxb relies on the validator of the marshaller by using the setSchema() method, see for example the post of Blaise Doughan, so in any moment you can validated your object against the schema to be sure before sending out your xml message.

For the facets, Jaxb doesn’t create annotations, there is still an open issue where 2 subgroups are working to solve it but still none of them are officially approved:

Other data bindings work on simple restrictions like enumerations (see Jibx) or numeric type and enumerations (see Xmlbeans, now archived).

You need also to consider that facets can change in the future (the max length could be extended from 100 characters to 200), so which impact has changing the xml schema? Since Jaxb doesn’t generate annotations for facets you don’t need to change the java objects but other databindings might be.

For the restriction on the the number of elements/attributes this CANNOT be reflected in a Object Oriented language because it is not possible to restrict an extended class on inherited properties. Therefore I suggest to copy simplified elements (by keep only those mandatory and removing the optionals), in this way:

  1. you have less dependency
  2. the developer has less method generated (just those needed)
  3. you keep compatibility at the minimum if you want to convert the copied object into the original objects

Therefore I would recommend to avoid restrictions if possible, you can keep them for the sake of validation but you need to think about the impact on the objects generated with the databinding libraries. Such recommandation is also expressed by the HP XML schema best practices (search for “restrictions for complex types”) and in Microsoft xml schema design pattern.

 

 

 

Mailman and OpenDJ

For a project I started to add mailing list functionality to the development infrastructure, as requirement the mailing list should make use of the LDAP service.

In a previous article I explained that I chose OpenDJ to manage LDAP users and groups and for the management of mailing list I chose Mailman to be used with Postfix. As alternative I could have chosen Sympa but since Mailman is packaged already in many distribution and it doesn’t have many requirements like Sympa has, I preferred to go for Mailman. The drawback is that Mailman doesn’t come with LDAP support and I found a script which I adapted and uploaded on my github account.

To manage mailing list in OpenDJ you could opt for 3 ways:

  1. use the mailGroup object class
  2. use the groupOfUniqueNames object class together with the extensibleObject class
  3. customize your schema

OpenDJ has the schema definition of mailGroup coming from the Solaris system which allows  to enter the mail of the group and the mail of the users by using the attribute mgrpRFC822MailMember.

But…most of the time you will already have your users recorded in LDAP (using most probably the inetOrgPerson) therefore, instead of directly inserting users mail you would like to insert the DN of each user, in this way even PHPLdapAdmin allows you to insert easily each user without pasting email addresses.

Therefore I went for groupOfUniqueNames which allows me to define users with their DN and, by adding the extensibleObject class, I have the possibility to add the mail attribute.

In this way if I have already defined a group I can use it as well for mailing list without replicating information (e.g. mail address for each user with mailGroup).

After that I installed and configured Mailman and, as described above, I used a perl script which:

  1. extracts those groups that are groupOfUniqueNames and have a mail attribute set (&(objectClass=groupOfUniqueNames)(mail=*))
  2. extract all the users uniqueMember belonging to the group which have an mail address set and are not disabled (&(mail=*)(!(ds-pwp-account-disabled=*)))
  3. creates a mailing list which will have as name the CN group attribute (saved inside the variable $listname)

If you execute the script with the command:

perl cron-ldap.pl

you will get an output like:

------------------------------
Processing list: jenkins-users-mail
Found member: emidio
List jenkins-users-mail does not exist.
Creating new list jenkins-users-mail.

Syncing jenkins-users-mail...
Added : emidiostani@gmail.com
------------------------------

enjoy creating mailing lists !

Export Crowd users and groups to ldap

I need to migrate users and groups created in Atlassian Crowd to OpenDJ, an open source LDAP server written in Java fork of OpenDS; in this way Crowd will act just as SSO and it will rely on OpenDJ to store users.

Crowd can rely on several LDAP servers but it doesn’t have tools to export users and groups to ldap so I created a couple of queries (my database is PostgreSQL) to export in ldif files:

COPY(select 'dn: cn=' || user_name || ',ou=users,dc=my,dc=website,dc=com' || chr(10) ||
'objectClass: inetOrgPerson' || chr(10) ||
'objectClass: organizationalPerson' || chr(10) ||
'objectClass: person' || chr(10) ||
'objectClass: top' || chr(10) ||
'mail: ' || lower_email_address || chr(10) ||
'uid: ' || user_name || chr(10) ||
'cn: ' || user_name || chr(10) ||
'givenName: ' || first_name || chr(10) ||
'sn: ' || last_name || chr(10) ||
'displayName: ' || display_name || chr(10) ||
'userPassword: ' || credential || chr(10) || chr(10)
from cwd_user) TO '/tmp/users.ldif';

Of course you can customize the query to adapt to your needs, have in mind that the userPassword most probably is stored in PKCS5S2 format (you can see that the passwords extracted start with {PKCS5S2} prefix, see Crowd hashes), which is the default for Crowd or called Atlassian-security. In this sense only ApacheDS currently (2.0.0-m16) supports the same format while for OpenDJ there is a change to be done to the pbkdf2 storage scheme. It is up to you if you prefer to reset passwords or try to keep them during the migration.

As small parenthesis I have been asking to those people of Crowd about their connector for ApacheDS, since their connector supports ApacheDS 1.5.x (not anymore maintained by ApacheDS developers) and they are not sure to spend time in updating such connector for version 2.0, see my question on Atlassian Answers. An interesting thing to know is that also WSO2 Identity Server has bundled ApacheDS 1.5.7, you can see it also in the source code but you can connect it to OpenDJ.

In case your Crowd was setup with SSHA1, you can also use Openldap, see FAQ.

For the groups the query would be something like:

COPY(select 'dn: cn=' || parent_name || 'BASE_GROUPS' || chr(10) ||
'cn: ' || parent_name || chr(10) ||
'objectClass: groupOfUniqueNames' || chr(10) ||
'objectClass: top' || chr(10) ||
'uniquemember:' array_to_string(array_agg(child_name),',')
from cwd_membership group by parent_name) TO '/tmp/groups.ldif';

As you can see I used the functions array_to_string and array_agg integrated in PostgreSQL to create the list of users. These users will not have their base (,ou=users,dc=my,dc=website,dc=com), so, after executing the query, you can apply some regular expression (I used Notepad++) to replace the regex “([,])” with “,ou=users,dc=my,dc=website,dc=com\nuniquemember: cn” which replaces all the “,” inserted with the query with their base. At the end, I replaced BASE_GROUPS with “,ou=groups,dc=my,dc=website,dc=com“.

You can import the ldif files with PHPLdapAdmin or directly in OpenDJ with import-ldif command, keep in mind that you need to allow pre encoded passwords before importing.

Enjoy the migration !