Category: Open Source

A Pulse on the Cassandra Community

Posted by on April 08, 2015


Standing Room Only

I recently attended a Cassandra Day community event. If the crowd was any indication of the install base, I would have to say the interest is strong to quite strong. The weather was miserable in Atlanta and people still managed to show up!

You can spend a lot of time reading articles and various ‘expert’ opinions across the technology world on what the latest and greatest tool is. In addition to reading, I would encourage you to collect your own data points and GO TO a community event. You’re able to feel the energy and see firsthand the momentum of a product – or lack thereof. Vendor pitches are kept to a minimum and the content tends to be well thought out.

This was my first Cassandra-dedicated event but not my first exposure to the database. The schedule looked typical on the technical side, and the morning even included a ‘business’ session with use-case examples. I made an effort to see a little of both.

The business track kicked off with the traditional ‘endless amounts of data’ sales pitches and an introduction to DataStax, the commercial company that offers support and tooling for Cassandra. Standard vendor conference stuff.

So who is using Cassandra? I was pleasantly surprised to hear and learn about companies other than Netflix. Don’t interpret that negatively: Netflix is a great spokesperson for the Cassandra community and helped launch it toward the top of the NoSQL leaderboard. Netflix’s scale is impressive, but there is nothing wrong with a little variety. Ask and you shall receive: Target, Safeway and Kroger were all presented as current users.

I found Safeway’s use of Cassandra most interesting. They are building an app that takes your shopping list, locates the nearest store and maps product to aisle within that store. It’s cool to see grocery chains embrace technology and try to solve a problem every person has struggled with at one point or another. During the presentation one question stood out in my mind: why no Mongo? Geolocation has been a very strong use case for MongoDB since the early days of the product. Unfortunately, the overview was light and the entire story was not shared, but it makes me wonder if Cassandra is closing the gap on MongoDB faster than the market perceives.

After DataStax we heard from Asurion (you know, the cell phone insurance folks: you lose or break your phone and they send you a new one for $200), who shared their journey to Cassandra. They followed a typical path through NoSQL enlightenment: RDBMS was failing at scale, along with other things, and they needed a fix.

Asurion’s story: step 1, Postgres to MongoDB. Why didn’t that work for them? According to Asurion’s VP, it boiled down to a lack of understanding of MongoDB and the assumption that it was a drop-in replacement for an RDBMS (applying the same relational principles to NoSQL and expecting a better outcome). Operational tooling was another reason MongoDB wasn’t selected; the lack of tooling created a broader challenge when it came to scaling internal resources and their ability to use the tool.

But why Cassandra? DataStax’s operational tooling, analytics strength and scaling capabilities were all mentioned as reasons why, and why they continue to build on Cassandra. I wouldn’t get hung up on the analytics and scale comments since these mean different things to different people. Takeaway: DataStax’s focus on operational tooling continues to be a valuable strategy for gaining new customers.

If MongoDB had been evaluated second, would it have prevailed? Maybe, maybe not.

Data Modeling 101 was the most memorable session for me. When you walk into a large conference room and there are no available seats – you know you’re in for a good session. I’m not exaggerating – there weren’t even the awkward middle seats that people leave open to avoid sitting close to strangers. (see picture at the top)

The speaker was patient and managed the large crowd well. There were plenty of relational-to-Cassandra modeling comparisons throughout his session.

Now for the recap. So what did I learn?

  • Cassandra’s interest continues to grow
  • DataStax is growing (employees and customers)
  • Hadoop compatibility is top of mind

Cassandra still has ground to make up on MongoDB if you follow the DB-Engines rankings. DataStax’s growth is good for the community and for product evolution. When businesses bet their livelihood on an Apache-licensed open source product, the community benefits more often than not. When exploring a new technology the ecosystem can be a great indicator of current health, but DON’T just focus on quantity. The ecosystem is a good indicator of whether the technology will a) last and b) prosper. Finally, Hadoop compatibility will continue to be an important piece of the NoSQL conversation, and DataStax seems to be recognizing that and making it a priority.

It was awesome to see the enthusiasm behind Cassandra in person. NoSQL concepts and products are still new to a lot of people. It is going to take time to make a dent in the RDBMS world but it’s happening. As tools like Cassandra and MongoDB continue to put pressure on the relational databases, users will win. Developers will continue to push database technologies to the limits – therefore forcing products to evolve for the good or disappear.

Implement Faceted Search with Solr and Crafter WEM

Posted by on March 25, 2013

Crafter Engine, the delivery component of Crafter Rivet Web Experience Management, provides powerful out-of-the-box search capabilities based on Apache Solr.  Solr is extremely fast and provides a wide range of capabilities that include fuzzy matching, full-text indexing of binary document formats, match highlighting, pagination, “did you mean”, typed fields and, of course, faceted search. Faceted search (aka faceted navigation) is the ability of the search interface to break a search down into categories, allowing a user to filter and narrow the number of results by selecting only those category values that are relevant.

faceted-search

Before we get into the construction of a faceted search let’s take a quick step back and look at some basic architecture.

The first thing to think about is the type of thing we’re going to be searching on.  From a web content management perspective, this is often referred to as the content model. A content model in its most basic form is just the description of an entity like an article and its properties such as title, author, publish date, body and so on.    In the figure above we see a search-driven UI that allows the user to narrow down a collection of jeans by size, color and fit.  In order to enable this we have to “model” the jeans.  These filters are criteria that must be associated with each instance of the content type.  Each field (color, size, fit) has many possible values that are selected by an author when a jean object is created.

product-model

In the figure above you can see just a small portion of the Jeans product content type in the Crafter Studio drag and drop content type management tool.  Note the fields for size, color and the data sources that pull values for these fields from managed taxonomies.

Once we’ve created our content type we can now create instances of jeans, provide the details for the product and select the criteria that correctly categorizes the pair of jeans.

select-criteria

Whenever an object is published from Crafter Studio (the content authoring environment) to Crafter Engine (the delivery platform), it is immediately indexed by Solr with the help of Crafter Engine’s deployment Solr plug-in.  Once published Solr is aware of each category and selected values for that category.

Now that we have content indexed in Solr we can build a search page. We’re going to build the jeans category page from the first figure. All of the coding will be done in the FreeMarker template language supported by Crafter Engine. For our example we’ll keep the implementation very straightforward without any abstraction.  Advanced coders may choose to factor and encapsulate the code differently.

To begin, create or navigate to your category page content type (standard fields are fine) and then open the template editor.  For a more in-depth tutorial on basic content modeling click here.

template-editor

Now that we have our template editor open, we’re ready to begin coding. Let’s start with a review of some basic requirements.

  • We need to maintain or store the user’s selections for the various filters so that they persist from one search execution to another.
  • We need to allow the user to simultaneously filter all three categories (color, size, fit)
  • We want to provide the user with a count of the number of items available for each category value
  • We need to provide sorting (in our case price high to low, price low to high, and by arrival date)
  • We need to provide pagination (showing n results per page)

Maintaining the user’s selection

How you choose to maintain the user’s selections so that they are available across search executions is largely a function of a few factors:

  • How long do the values need to persist:  Only so long as the user is on the page? For the session? Whenever they visit the site?
  • How sensitive is the value being stored?
  • How are you refreshing the results: page reload or Ajax?

You have many options, from simple JavaScript values (which last only as long as the user does not leave or refresh the page) to cookies, sessions and profiles, each of which has its own life-cycle and security attributes.

For our example we’re going to store the values in a cookie.  This requires no additional configuration and persists across several visits.  To do this we’ll need the following code:

Create template variables with current cookie values

As you can see, the code simply creates a template value for each user selection based on the value from the cookie.  If no cookie is found, a default value is supplied with FreeMarker’s ! operator (e.g. !"*"). This code would typically appear close to the top of the template.

<#assign sort = (Cookies["category-sort"]!"")?replace("-", " ")>
<#assign filterSize = (Cookies["category-filter-size"]!"*")>
<#assign filterColor = (Cookies["category-filter-color"]!"*")>

Render controls with values selected from cookies

Now we need to build the filter controls for our users so that they can narrow their searches. In the code below we’re iterating over the available options (we’ll show how these are acquired in just a moment) and creating the options for the select component.  For each option we look to see if it is the currently selected item and if so we mark it as selected.

<select style="width: 90px"  onchange="setCookie('category-filter-color', this.value);">
   <option <#if filterColor=='*'>selected</#if> value="*">Color</option>
   <#list colors?keys as colorOption>
      <option <#if filterColor==colorOption>selected</#if> value="${colorOption}">${colorOption} (${colors[colorOption]})</option>
   </#list>
</select>

Provide a mechanism to save a selected value to our cookie and force a refresh

In the code above you can see a simple JavaScript function attached to the select control’s “onchange” handler.  Again, we’re keeping the code as abstraction free as possible to make the example clear.  Below is the simple JavaScript function:

<script>
  var setCookie = function(name, value) {
    // Save the selection, then reload the page so the template
    // re-renders with the new filter applied.
    document.cookie = name + "=" + value + "; path=/;";
    document.location = document.location;
    return false;
  }
</script>

Building the Query and Filter Options

Now that we have a mechanism for choosing criteria it’s time to use those values to create and execute a query.  In the section below we’ll look at how queries are built and executed through the Solr-powered Crafter Search interface.

Construct a query that is NOT constrained by filters.

We will use the results of this query to get the possible values and counts for our filters.
Below you can see we’re building up a simple query for the jeans content type, gender, category and collection.

<#assign queryStatement = 'content-type:"/component/jeans" ' />
<#assign queryStatement = queryStatement + 'AND gender.item.key:"' + gender + '" ' />
<#assign queryStatement = queryStatement + 'AND category:"' + category + '" ' /> 
<#assign queryStatement = queryStatement + 'AND collection.item.key:"' + collection + '" ' />

Construct a query based on the first but with additional filter constraints

We will use the results of this query to display the results to the user.

<#assign filteredQueryStatement = queryStatement />
<#assign filteredQueryStatement = filteredQueryStatement + 'AND size.item.value:"' + filterSize + '" ' />
<#assign filteredQueryStatement = filteredQueryStatement + 'AND color:"' + filterColor + '" ' />

Execute the unfiltered query

Here you can see we’re declaring the facets we want the counts on.

<#assign query = searchService.createQuery()>
<#assign query = query.setQuery(queryStatement) />
<#assign query = query.addParam("facet","on") />
<#assign query = query.addParam("facet.field","size.item.value") />
<#assign query = query.addParam("facet.field","color") />
<#assign executedQuery = searchService.search(query) />

Execute the filtered query

Here you can see we’re declaring the pagination and sorting options.

<#assign filteredQuery = searchService.createQuery()>
<#assign filteredQuery = filteredQuery.setQuery(filteredQueryStatement) />
<#assign filteredQuery = filteredQuery.setStart(pageNum * productsPerPage)>
<#assign filteredQuery = filteredQuery.setRows(productsPerPage)>
<#if sort?? && sort != "">
 <#assign filteredQuery = filteredQuery.addParam("sort","" + sort) />
 </#if>
<#assign executedFilteredQuery = searchService.search(filteredQuery) />

Assign the results to template variables

Below you can see how we’re getting the matching jean objects and the number of results returned from the filtered query response.  You can also see how we’re getting the available options and counts from the unfiltered query response.

<#assign productsFound = executedFilteredQuery.response.numFound>
<#assign products = executedFilteredQuery.response.documents />
<#assign sizes = executedQuery.facet_counts.facet_fields['size.item.value'] />
<#assign colors = executedQuery.facet_counts.facet_fields['color'] />

Displaying the Results

Display the products

In the code below, we’re iterating over the available products and displaying the details for each.

 <#list products as product>
    <#assign productId = product.localId?substring(product.localId?last_index_of("/")+1)?replace('.xml','')>
    <@ice componentPath=product.localId />

    <div>
       <img src="${product.frontImage}" />
       <div style='width:170px;'><a href="/womens/jeans/details?p=${productId}">${product.productTitle}</a></div>
      <div>${product.price_d?string.currency}</div>
      <div>                                
          <@facebookLike contentUrl='http://www.rosiesrivets.com/womens/jeans/details?p=${productId}' width="75" faces="false" layout="button_count"/>
       </div>
   </div>
</#list>

Construct pagination

Given the number of items found and our productsPerPage value we can determine the number of pages to show to the user.

<div>
    <ul>
        <#assign pages = (productsFound / productsPerPage)?ceiling />
        <#if pages == 0><#assign pages = 1 /></#if>        
        <#list 1..pages as count>
            <li <#if count=(pageNum+1) >class="active"</#if>><a href="${uri}?p=${count}">${count}</a></li>
        </#list>
    </ul>
</div>


Alfresco Cloud’s Key Capabilities

Posted by on March 15, 2013

SaaS Based Collaboration

The first aspect and most basic use of Alfresco Cloud is as a cloud-hosted collaboration application for your organization.  Alfresco Cloud is multi-tenant and can host as many organizations (which Alfresco calls networks), and as many project spaces within each of those networks, as needed.

In the illustration below you can see two independent organizations each with several project teams working independently on the Alfresco Cloud.

 

If you need to spin up a simple collaboration environment for your department Alfresco Cloud is a great solution.  Alfresco Cloud is affordable and based on per user pricing.  There is zero software to install or setup and you get a ton of really rich collaborative features from document libraries to wikis, calendars, blogs and much more.

Cross-Organization Collaboration

Where things start to get really interesting, however, is with cross-organization collaboration.  With Alfresco Cloud you can manage content between organizations to enable B2B interactions between knowledge workers from the different organizations – again all with zero infrastructure setup.

In the illustration below you can see a project team from each organization collaborating with one another through Alfresco Cloud’s permissions which ensure that only that content which should be shared is in fact shared.

Alfresco One: Private – Public Cloud Sync

The thing is that not all content is meant to live in the cloud.  Organizations of all sizes generally have some content they feel needs to be controlled and secured inside the firewall, or, as is often the case, there are mandatory integrations with critical business systems that are only possible between systems located within the firewall.

With Alfresco Cloud this is no issue.  You can set up and host your own private infrastructure internally, which serves as the system of record and hosts all of your content, including those items that must remain internal.  For content you want to collaborate on with organizations outside the firewall, you can create a synchronization (using Alfresco One) between your organization’s private infrastructure and Alfresco Cloud, syncing specific content to facilitate the collaboration.

In the illustration above we have a private infrastructure on the left and the cloud on the right. You can see that some project teams work only against this internal infrastructure while others may work only against the cloud.  And we can see a secure synchronization relationship between our internal infrastructure on the left and the Alfresco Cloud on the right.  This synchronization enables our teams to collaborate with one another regardless of whether they are working on public or private infrastructure.

Remote API for the Cloud

And finally, Alfresco Cloud supports a remote application programming interface, or API, which is based on CMIS (Content Management Interoperability Services) plus a few additional Alfresco-specific non-CMIS APIs.

This is a real game changer because it means that collaboration no longer has to take place only through the user interface.  As we can see in the diagram, applications and automated processes can participate in our collaborations.  And because we have a sync between private and public cloud infrastructure, we’re not just talking about cloud-based content storage (which is great in its own right); we also have a very powerful integration platform.

When you combine the API and the public/private sync what you gain is infrastructure akin to an integration bus.
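To make the API point concrete, here is a minimal, hedged sketch of how an external Java application might prepare a connection to a CMIS repository such as Alfresco Cloud using the Apache Chemistry OpenCMIS client. The endpoint URL and credentials are placeholders, and the final session-creation call (shown only in a comment) assumes the OpenCMIS client library is on the classpath; the string keys are the values behind OpenCMIS's SessionParameter constants.

```java
import java.util.HashMap;
import java.util.Map;

public class CmisConnectionSketch {

    // Builds the session parameter map the OpenCMIS client expects.
    // The keys correspond to SessionParameter.USER, PASSWORD,
    // ATOMPUB_URL and BINDING_TYPE respectively.
    public static Map<String, String> sessionParameters(String user,
                                                        String password,
                                                        String atomPubUrl) {
        Map<String, String> params = new HashMap<String, String>();
        params.put("org.apache.chemistry.opencmis.user", user);
        params.put("org.apache.chemistry.opencmis.password", password);
        params.put("org.apache.chemistry.opencmis.binding.atompub.url", atomPubUrl);
        params.put("org.apache.chemistry.opencmis.binding.spi.type", "atompub");
        return params;
    }

    public static void main(String[] args) {
        // Placeholder endpoint -- consult your Alfresco network's actual CMIS URL.
        Map<String, String> params = sessionParameters(
                "user@example.com", "secret",
                "https://example.alfresco.com/cmis/atom");

        // With OpenCMIS on the classpath, a session would then be created with:
        // Session session = SessionFactoryImpl.newInstance()
        //         .getRepositories(params).get(0).createSession();
        System.out.println("binding = "
                + params.get("org.apache.chemistry.opencmis.binding.spi.type"));
    }
}
```

Once a session exists, the same standards-based interface works against any CMIS-compliant repository, which is what makes the public/private sync an integration platform rather than just a content store.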

 

 

 

Alfresco Cloud is much more than meets the eye

Posted by on February 28, 2013

As many of you know Alfresco introduced its cloud offering almost a year ago. At the time of this writing there are a number of unique ways you can interact with Alfresco Cloud:

  • Collaborative SaaS (Software as a Service) application. Teams can quickly spin up collaborative spaces (in Alfresco’s Share application) and begin working together with zero on-premise software.
  • Members of the cloud can join multiple networks which enables them to work and collaborate across organizational boundaries.
  • Custom applications can use Alfresco Cloud as a content store. You can interact with the cloud through an API (CMIS and Alfresco-specific RESTful APIs).
  • And you can sync content between your on-premise instance of Alfresco and the networks within the cloud that you belong to.

Share in the cloud as a SaaS offering is a pretty obvious play. There is a lot of value in this simple use case for organizations that need good collaboration tools but just don’t have the appetite for or enough user volume to justify hosting their own infrastructure.

When you combine this SaaS offering with the ability to securely and selectively collaborate with other organizations, you are now enabling all kinds of people-oriented B2B interactions that can be extremely difficult when you have a system that is stuck behind a firewall.

Add an API to that, and now it’s not just people-oriented B2B and internal interactions that can take place; it’s automation and rich behavior.  At this point Alfresco in the cloud is no longer an application.  It’s a bus.

Now not all content was meant to live outside the firewall and not all systems can or even should live/reach outside the firewall.  With Alfresco’s “cloud sync” capability Alfresco Cloud closes this gap by allowing organizations to selectively and securely sync specific content between an on-premise instance and the cloud.  This is extremely exciting because it opens up a whole new realm of possibilities for B2B integration and mobile enablement.

If you’re thinking about Alfresco Cloud as a simple collaboration application or a simple cloud-based content store, it’s time to rethink.  Alfresco Cloud paired with Alfresco on-premise is an extremely exciting hybrid architecture and integration middleware that opens up use cases which have traditionally required dedicated business-to-business infrastructures that were difficult to get approved, let alone set up: basically not possible.

On March 14th Rivet Logic will co-host a webinar with Alfresco entitled Using Alfresco’s Hybrid Cloud Architecture for Better Web Content Management, where we will discuss and demonstrate how hybrid architectures can be applied in a WCM (Web Content Management) context to enable collaboration with external partners like agencies, and for integrations with other content services, providers and consumers such as AP, Reuters and the like.  While WCM use cases will be the focus of the conversation, the topic is perfect for anyone interested in learning more about Alfresco hybrid architectures.  See you there!

https://www.alfresco.com/events/webinars/using-alfrescos-hybrid-cloud-architecture-better-web-content-management

Web CMS and Digital Assets: Crafter Rivet / Alfresco Integration with Adobe Photoshop

Posted by on February 21, 2013

Digital assets are a key component of almost all web experience and customer engagement projects. In today’s era of engagement, with all of the additional content targeting, personalization, internationalization and multi-channel publishing, the number and permutations of digital assets associated with any given project are growing rapidly.  This trend will only continue as we move forward.  Content workers (authors, designers, content managers) need to be able to create, locate, modify and manage the growing number of assets easily and efficiently in order to maintain brand quality and deliver projects on time and on budget.

In today’s blog entry we’re going to focus on the creative side of WCM (Web Content Management) and DAM (Digital Asset Management) even though this is only a small portion of the overall set of use cases.

Let’s begin by considering the following example use cases:

  • Create mobile appropriate image resolution variants
  • Create video stills
  • Imprint watermarks
  • Thumbnails for galleries and promotional kickers

Each of these use cases are important ingredients in providing the user with a great experience but they also introduce a lot of additional work for our content teams.  One of the ways to deal with the large volume of asset creation and manipulation responsibilities is to automate them.   The use cases mentioned above and many others like them are a perfect candidate for automation.

Crafter Rivet leverages Alfresco’s enterprise content management services for image transformation. With a few simple rules applied at the repository level, it’s possible to create image resolution variants and video stills, apply watermarks, scale and crop thumbnails, and make these assets available for review by authors, all in an automated fashion with no additional labor required beyond uploading the canonical assets.

Another important way to help our content teams cope with the sheer volume of digital asset related workload is to make sure our teams are able to work with the very best tools at their disposal.  With today’s modern browsers it is possible to provide a fairly decent set of asset manipulation tools right within the browser.  However, while purely web-based tools have their advantages, they are often slower and much less powerful than the desktop tools serious content contributors are used to working with.

The biggest productivity boosts are gained when we empower our designers and other content workers on our team with rich, native tools that they are already familiar with and work  with on a daily basis.

Adobe’s Creative Suite (which contains tools like Photoshop) is the quintessential software package for image/digital asset creation and manipulation.  Designers are deeply familiar with these tools and are able to leverage their enormous arsenal of capabilities to accomplish a tremendous amount of work in a short amount of time.  The issue that many organizations face is that while the tools themselves are great, the interfaces between the tools and the systems that ultimately store, manage and deliver the assets are either non-existent, based on manual processes, or clunky.  This gap creates a drag on productivity and introduces room for error.

Fortunately Alfresco, Adobe and Crafter Rivet Web Experience Management have a solution that seamlessly connects rich, creative desktop tools to your systems of record (e.g. the repository) and ultimately to your systems of engagement (e.g. the website).  Content creators work right within the rich, local tools they are familiar and productive with, and those tools are deeply integrated with the repository, which means that all of the organization, policies, metadata extraction and versioning provided by the repository are seamlessly applied and enforced.  Alfresco is a CMIS (Content Management Interoperability Services) compliant repository.  This standards-based interface enables external applications like Adobe’s products to interact with and operate on the content, metadata, permissions, versions and so on housed within the repository.  Adobe provides a platform called Adobe Drive which enables its tools to connect in a rich fashion over CMIS to Alfresco.  Once we’ve connected our Adobe tools and our Alfresco repository, authors working within Crafter Studio (the authoring and management component of Crafter Rivet) can see content updates coming from the Adobe tools right in context with the work they are doing, through in-context preview and editing. They can also interact with that content through Crafter Studio’s web-based tools, workflow, versioning, metadata capture and publishing capabilities.

By closing the integration gap we can now provide powerful tools for productivity and at the same time do so in a way that makes it seamless and easy for our creative teams to collaborate across the entire process.

Click on the video below to see Adobe and Crafter Rivet WEM / Alfresco in action together!

Video of Photoshop altering images in Crafter Rivet Web CMS and Alfresco

 

Crafter Rivet is a 100% open source, Java-based web CMS for web experience management and customer engagement.  Learn more about Crafter Rivet at crafterrivet.org

Web CMS Content Enrichment with OpenCalais, Crafter Rivet and Alfresco

Posted by on February 15, 2013

Content enrichment is the process of mining content in order to add additional value to it.  A few examples of content enrichment include entity extraction, topic detection, SEO (Search Engine Optimization) and sentiment analysis.  Entity extraction is the process of identifying unique entities like people and places and tagging the content with them.  Topic detection looks at the content and determines to some probabilistic measure what the content is about.  SEO enrichment will look at the content and suggest edits and keywords that will boost the content’s search engine performance. Sentiment analysis can determine the tone or polarity (negative or positive) of the content.

Content enrichment provides an enormous opportunity to improve the effectiveness of your content.  However, it is clear that detailed analysis and the work of adding additional markup and metadata to content can be extremely time consuming for authors and content managers.  Fortunately there are many free and commercial services available that can be used to enrich your content while saving countless hours for authors and content managers.

One such service is OpenCalais from Thomson Reuters.  OpenCalais is a toolkit of capabilities that includes state-of-the-art semantic data mining of content via RESTful services.

In the video below you’ll find a short demonstration of how OpenCalais can quickly be integrated with Crafter Rivet’s authoring platform (Crafter Studio) to make it extremely fast and easy for authors to enrich articles and other types of content with rich, structured metadata.

Extending Crafter Engine with Java backed functionality

Posted by on October 06, 2012

Crafter Engine is the high-performance website / web app delivery engine for Crafter Rivet Web Experience Management. Out of the box Crafter Engine ships with support for many of the types of engaging functionality you have come to expect from a WCM/WEM platform.  However, there are times when we want to add additional capabilities to integrate with internal and 3rd party systems to meet specific business objectives.  In this article we’ll demonstrate how you can create Java backed plug-ins for Crafter Engine.

To illustrate the integration process we’ll integrate a simple RSS reader based on the ROME RSS processing library.
You can download this example (and others) at the following SVN location: https://svn.rivetlogic.com/repos/crafter-community/ext/engine

To begin let’s start with some background on Crafter Rivet, Crafter Engine and how a plug-in is organized structurally:

Crafter Rivet

Crafter Rivet is a web experience management solution based on the Alfresco content management platform with a decoupled architecture.  This means that the authoring environment and the production delivery environment are separate infrastructures, integrated by workflow and deployment from authoring to delivery.  Crafter Rivet components power some of the internet’s largest websites.  A decoupled architecture makes things easier to support, more flexible and very scalable.

Crafter Engine

Crafter Engine owes its performance characteristics to its simplicity.  Content is stored on disk and is served from memory.  Dynamic support is backed by Apache Solr. At the heart of Crafter Engine is Spring MVC, a simple, high performance application framework based on one of the world’s most popular technologies: Spring framework.

What is a Crafter Engine Plug-in

A Crafter Plug-in is a mechanism for extending the core capabilities of Crafter Engine, the delivery component of Crafter Rivet. Plug-ins allow you to add additional services and make them available to your template / presentation layer.

An example plug-in, as we suggested above, might be an RSS reader which would function as specified below:

  • The plugin would expose a in-process Java based service like Feed RssReaderService.getFeed(String url)
  • Your your presentation templates would then simply call <#assign feedItems = RssReaderService.getFeed(“blogs.rivetlogic.com”) />

Anatomy of a Crafter Engine Plug-in

You will note from the diagram above that plug-ins are simple JAR files containing Java code, configuration, and a Spring bean factory XML file that loads and registers the services with Crafter Engine.

Loading the plug-in

1. When the container loads the Crafter Engine WAR the shared classes lib folder is included in the class path.

2. When the Crafter Engine WAR starts up, it scans the class path for its own Spring files and any available plug-ins. At this time the services described in the Spring bean XML file within your plug-in are loaded. Your service interfaces are now available in the presentation layer.
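Conceptually, step 2 works because every plug-in ships its bean file at a well-known class path location, so all copies of that file can be discovered at startup. A minimal stdlib-only sketch of that discovery (the class name and resource path here are illustrative; Crafter Engine’s real mechanism is Spring’s own context loading):

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

public class PluginScanner {

    // Collect every copy of a given context file visible on the class path.
    // Each plug-in JAR that ships the file at the same path contributes one URL.
    static List<URL> findContexts(String resourcePath) throws IOException {
        return Collections.list(
                PluginScanner.class.getClassLoader().getResources(resourcePath));
    }

    public static void main(String[] args) throws IOException {
        // A Crafter plug-in ships its Spring file at a well-known path,
        // e.g. "crafter/engine/extension/services-context.xml".
        for (URL url : findContexts("crafter/engine/extension/services-context.xml")) {
            System.out.println("found plug-in context: " + url);
        }
    }
}
```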

Interacting with your service

A. The user makes a request for a given URL

B. Crafter Engine will load all of the content descriptors from disk for the page and components needed to render the specific URL. Once the descriptors are loaded, the templates for the page and components will be loaded from disk.

C. The templates may now call your service interfaces to interact with your back-end code. Your Java-backed service may then do whatever it was intended to do, returning the results to the template for processing. Once complete, the response is returned to the user in the form of a rendered web page.

A Simple Example

Now that we have a bit of background, let’s get down to the nuts and bolts of the matter and build, install and configure our RSS reader integration.

The Java Service

import java.io.IOException;
import java.net.URL;

import com.sun.syndication.feed.synd.SyndFeed;
import com.sun.syndication.io.FeedException;
import com.sun.syndication.io.SyndFeedInput;
import com.sun.syndication.io.XmlReader;

/**
 * Reads an RSS or Atom feed from a specified URL and returns a data structure
 * that can be accessed from a template.
 */
public class RssReaderService {

    public SyndFeed getFeed(String url) throws IOException, FeedException {
        SyndFeedInput input = new SyndFeedInput();
        XmlReader reader = new XmlReader(new URL(url));

        try {
            // Parse the feed into ROME's SyndFeed model for the templates.
            return input.build(reader);
        } finally {
            try {
                reader.close();
            } catch (IOException err) {
                // The feed has already been parsed at this point, so a failure
                // to close the reader is worth logging but not propagating.
            }
        }
    }
}
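ROME does the parsing work in the service above. Purely to illustrate the kind of data the templates end up iterating over, here is a stdlib-only sketch that extracts item titles from a minimal RSS 2.0 document (the class name is hypothetical; this is not a replacement for ROME, which also handles Atom, encodings and many edge cases):

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class SimpleRssParser {

    // Pull the <title> of every <item> out of an RSS 2.0 document.
    static List<String> itemTitles(String rssXml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(rssXml.getBytes(StandardCharsets.UTF_8)));
        List<String> titles = new ArrayList<>();
        NodeList items = doc.getElementsByTagName("item");
        for (int i = 0; i < items.getLength(); i++) {
            NodeList titleNodes = ((Element) items.item(i)).getElementsByTagName("title");
            if (titleNodes.getLength() > 0) {
                titles.add(titleNodes.item(0).getTextContent());
            }
        }
        return titles;
    }

    public static void main(String[] args) throws Exception {
        String feed = "<rss version=\"2.0\"><channel><title>Blog</title>"
                + "<item><title>First post</title></item>"
                + "<item><title>Second post</title></item>"
                + "</channel></rss>";
        System.out.println(itemTitles(feed)); // [First post, Second post]
    }
}
```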

The Configuration

/crafter/engine/extension/services-context.xml:

<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.1.xsd">

    <bean id="rssReaderService" class="org.rivetlogic.crafter.engine.rss.RssReaderService" />
</beans>

Packaging and Install

The compiled class and Spring configuration file must be found on the class path.  To facilitate this, we build these objects and their dependencies into a single JAR file.  You can find the build process for these at the SVN location above.  Place the JAR file produced by the build into your shared class path, for example /TOMCAT-HOME/shared/lib, and restart the application.

Now that you have restarted the application your service is available to your presentation layer.  This means that in any template you can now add the following:

<#assign feedItems = RssReaderService.getFeed("blogs.rivetlogic.com") />

Using the Plug-in:

Web Ninja: Create an RSS Feed Component for Authors

Create a new content type as a component:

Create a template for the RSS Widget by clicking on the type canvas and editing the template property:

Author: create and configure an RSS Component

Modify the component properties through the content type form:

And insert the widget into a web page:

Save and close and the configured component is now showing on your web page!

Resources for Crafter Rivet – Web CMS for Alfresco 4

Posted by on April 16, 2012

Last week was a busy week for those of us working on Crafter Rivet, the WEM/WCM extension for Alfresco 4.0.  We’re extremely excited about this release and are busy scheduling events and demos as word is starting to get out!

You can download Crafter Rivet here:

If you missed our Webinar last week that was co-hosted with Alfresco you can check it out here:
http://www2.alfresco.com/Crafter0412

For existing Alfresco WCM customers on Alfresco version 2 and 3 using the AVM based solution, we’ve put together a couple of blogs to help you think about your migration to Alfresco 4 and the core repository:

For everyone who wants to learn more about this exciting and powerful open source solution for web content and experience management that sits on top of Alfresco, the world’s most open and powerful content management platform, we’re hosting a Crafter Rivet Roadshow in a city near you in May!  Come on out for content-packed presentations, demonstrations, Q&A and free lunch!

Sign-up for the Crafter Rivet Roadshow here!

Crafter Roadshow Dates:

San Francisco
Tues. May 8

Los Angeles
Wed. May 9

Chicago
Tues. May 15

New York
Wed. May 16

Boston
Thur. May 17

Washington DC
Tues. May 22

Web CMS on The Alfresco Core Repository (Part 2 of 2)

Posted by on April 10, 2012

In yesterday’s post we covered the fact that Alfresco stopped selling the AVM-based WCM solution to new customers.  Existing customers using the AVM-based approach will continue to receive support until the AVM reaches end-of-life status.  New customers looking to Alfresco for WCM/WEM capabilities who read this will naturally wonder what the approach to WCM on Alfresco is.  Existing customers will want to know how to migrate off the AVM and into Alfresco’s core repository.

As we saw in yesterday’s post, the core repository has had the benefit of continuous innovation, through which it has grown into a platform capable of supporting use cases critical to WCM/WEM with key features like remote file deployments and form-based content capture.  Along with features clearly directed at the WCM use cases, the core repository is host to an amazing array of features and capabilities that make it ideal for all manner of enterprise content management, WCM/WEM included.

At Rivet Logic we have made a significant investment in a web content and experience management extension to Alfresco that we call Crafter Rivet. Crafter Rivet is a 100% open source, full-featured application that powers 100s of websites and has over 40 man-years of development effort invested in it. Initially, Crafter Rivet’s authoring and management capability was based on the AVM. When Alfresco made the decision to direct the full force of its innovation on the core repository, we knew it was time to convert. Just released, Crafter Rivet 2.0 is 100% based on the core repository, Solr search and Activiti workflow for Alfresco 4 CE and EE.  Making the switch from the AVM to the core repository required a deep examination of our use cases and the features we would have on hand within the core repository.

Now that we have the background understanding of the AVM and core repository features we discussed yesterday, it is time to look at the use cases the AVM was designed to address.  We will discuss each use case and how the gaps were addressed by Crafter Rivet.  Let’s get started!

Use Case: Sandboxing
As described in yesterday’s blog, sandboxing is a feature but there are use cases that drive this feature. In a Web CMS context we have times when we need to modify templates, CSS and other assets that have far reaching effects on our sites. There are three common use cases that point to sandboxing:

  • Development:  As a developer, I want to work with a template, a CSS file, a JavaScript library, etc. without worrying that the bugs I will inevitably create will interfere with the rest of the team’s ability to produce, manage, preview and publish content.
  • Timing: On websites of significant size and complexity, there are often projects created to update the look and feel of the site.  These projects, with future delivery dates, need to be able to take place without interfering with daily publishing.  Further, it’s important that the project be able to keep up with on-going updates to reduce the headache of last-minute merges.
  • Playground:  Sometimes we just want to play.  Sandboxes allow team members to innovate and experiment without fear of impacting the live website.

It’s clear that the ability to sandbox (or branch/merge) your website code base can be pretty handy. In my mind there are several key questions:

  • Is support for this use case a “must have” for my specific environment?  How often do I run into the use cases above?
  • What granularity of sandboxing do I need?

Many popular web content management systems do not natively support sandboxing. In many cases the need for branch/merge capability is handled through the development and deployment process. In general, I think it safe to say this feature is a rather strong “nice to have” unless you have a site whose look-and-feel components are literally being constantly updated and where the traditional development process would add too much time to the effort.

When you do need sandboxes, the next question is granularity and how sandboxes are used. The AVM UI dictates a sandbox for each user. My experience, accumulated over many Alfresco WCM engagements, is that this was too fine-grained for most engagements.  Most users want to work directly in context with other users.  They need basic locking on individual assets to keep work safe during their edits, but they don’t require an entirely separate and parallel universe.  The ability to create a sandbox ad hoc for a specific purpose maps more directly to the needs we see on the ground. In other words, a per-user sandbox is too granular, but a sandbox for a project to update the look and feel of the entire site, where users could work together, would more aptly address the kind of needs we see.

Crafter Rivet starts with the first finding: that sandboxing is not a “must have” feature and that, in fact, when it is applied it should be done so to facilitate specific projects and specific ad hoc needs. If you look at the way we have structured our content in the core repository, you will see we have left room to support one or more draft copies of the site.  In v2.0 we do not support layering in the default configuration; however, Crafter Engine, our content delivery and preview tier, is able to sit on top of multiple hierarchical stores and present them as one store, much in the same way the AVM did.

Use Case: History and Reversion in a Web CMS Context
As a user, when I want to preview a version of a specific asset, let’s say a page, I want to see the page as it was on that day. That means I want to see exactly the same components and assets (images, CSS, JS, etc.) as they were on that given day.  This is a real challenge in the core repository because there is no native support for linking assets together under a common version; each asset is individually versioned and the links between objects (associations) do not capture version.

Now to be honest, I have simplified the problem a bit to make a point.  I said that pages, for example, are compound assets and that you are always interested in seeing their dependencies at a given point in time.  This is often the case when we’re talking about images and components that are specific to the page but it’s not really the case when we’re talking about shared assets like templates, CSS, JavaScript, and shared components and shared collateral.  Think for a moment about why users want to look at and ultimately revert to previous versions of content.  They are doing so either:

  • In the short term because they have made a mistake, or
  • in the long term because they need to revert messaging to a previous point in time.

In the first instance there is likely to be no issue.  Common assets are likely to be the same as they were at the point in time of the version.  However, in the second case, we really want to see the old content in the context of the current templates, etc.  If we revert, it’s to get the older content, but we’re going to deploy it in the context of the latest look and feel of our site.

Handling versioning in Web CMS is a must have, must do capability.

In Crafter Rivet we considered these use cases fully and drew a distinction between two types of relationships: those that are page-only and those that are shared commonly amongst other objects in the system.  When you revert a page or any other object, relationships that are “page only” revert to a point in time, while relationships that are common continue to align to the latest version of the asset.

To accomplish this we leverage both the native versioning capability of the core repository and file organization and naming patterns.  In short, we organize page-only assets in folder structures that represent the page. Page-only objects are given names based on a hash of the object to guarantee a unique name, which means in effect that the versioning of the page-only object is “horizontal” in the repository. By horizontal I mean that a new file path is used rather than relying on the version store.  Shared objects, like templates or other common assets, are stored normally and rely on the native versioning support.  If you revert a page, you will revert to a state where the page points to a different set of files/file paths, achieving a solution for both use cases we mentioned above.
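As an illustration of the “horizontal” naming idea (this is a sketch of the concept only, not Crafter Rivet’s actual naming code; the class name is hypothetical), hashing the content of a page-only object yields a unique, stable file name per revision:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

public class PageAssetNamer {

    // Derive a stable, content-unique on-disk name for a page-only asset.
    static String hashedName(String descriptorXml) throws Exception {
        MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
        byte[] digest = sha1.digest(descriptorXml.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) {
            hex.append(String.format("%02x", b));
        }
        return hex + ".xml";
    }

    public static void main(String[] args) throws Exception {
        // Two revisions of the same component land at two distinct paths, so an
        // old page version can keep pointing at the old file path unchanged.
        System.out.println(hashedName("<component>v1</component>"));
        System.out.println(hashedName("<component>v2</component>"));
    }
}
```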

Snapshots
There are several Web CMS use cases that could require snapshots and version diffs.  For example, some websites have compliance-related requirements and thus must maintain versions of their sites so that, in the event of a dispute over information communicated via the site, they can easily prove what the site looked like at a particular moment in time.  The questions for snapshots are:

  • Is this something your organization must have?
  • And if so, is it something that the repository has to do for you?

Our experience shows that this feature, for the general market, is a nice to have.  Most customers don’t take advantage of this capability.  When we looked at this capability in Crafter Rivet, we decided it was not important to support natively within the repository itself.  If a customer needs a snapshot every day we simply include a deployment target that would produce a snapshot.

For those wondering about snapshot rollback: our experience has shown that this particular feature is really not relevant to most customers in day-to-day operation.  The feature has come in handy as a mechanism for bailing out people who have made sweeping changes to a site (100s of files) and deployed them with little or no QA, only to find a broken website after the fact.  In such a case, snapshot rollback is a lifesaver.  With the click of a button you can revert 100s of files.

Crafter Rivet, by design, is 100% file-based. In such a crisis scenario, a simple file-based backup could be used to restore a Crafter Rivet based site to a former state.  In the repository, you are unlikely to desire an actual rollback.  It’s more likely that you will want to keep the broken state, simply fix what is wrong, and then redeploy the working site.

Moving Forward

Alfresco v4 is an incredible platform, and the move to the core repository unlocks all of that capability and innovation. Crafter Rivet is a platform that made use of all of the functionality in the AVM, and with our new release, we made the move.  You can as well.  More importantly, if you are using the AVM with Alfresco v3 (or even v2), then Crafter Rivet is the perfect solution for your upgrade.  We can provide parity for most needs with a much better user experience that goes way beyond basic Web CMS needs, covering WEM use cases like integrated analytics and reporting; native mobile application authoring, preview, and presentation; content targeting and personalization; multi-channel publishing; and much more.  If you’re a new customer to Alfresco looking for a Web CMS solution, Crafter Rivet is a comprehensive WCM/WEM solution, with features that rival some of the major players in the industry.

Click here to learn more about Crafter Rivet

Click here to sign up for our webinar “Crafter Rivet – The WEM Solution for Alfresco 4” on April 12th at 1pm

Web CMS on The Alfresco Core Repository (Part 1 of 2)

Posted by on April 09, 2012

I was on a call today with a long-time Alfresco WCM customer who would like to upgrade from 3.x to 4.x.  They have been using the Alfresco WCM solution, which Alfresco has provided since 2006, for many years. As many of you know, Alfresco’s original WCM solution is based on a technology called the AVM – Alternative Versioning Model.  The AVM is a separate repository and associated user interface that was constructed to handle a number of WCM-related use cases. Alfresco has since enhanced their offering in their “Document Management” repository, which now handles WCM use cases.  As a result, Alfresco has announced that the AVM will no longer be offered to new customers.  In discussing the upgrade with our client on the call today, “It’s time to move to the ‘DM’” was the most responsible message to provide. Existing customers won’t lose support overnight, but eventually the AVM will hit its end of life.  You want to migrate at your earliest convenience rather than procrastinating and allowing pressure to build.

It’s also important at this point to abandon the use of the term “DM repository”. DM was used to differentiate from the AVM.  At this point there is only one repository, and the “core repository” is much more descriptive of the architecture going forward.  As Jeff points out in his blog, and as I will elaborate here, there are differences between the AVM and the core repository in terms of features. That said, features and use cases should not be confused. The core repository is every bit as capable of providing a platform for Web content management use cases as the AVM.

At Rivet Logic we have made a significant investment in a Web content and experience management application for Alfresco that we call Crafter Rivet. Crafter Rivet is a 100% open source,  full featured environment that powers 100s of websites and has over 40 man years of development effort invested in it. Initially Crafter Rivet’s authoring and management capability was based on the AVM. When Alfresco made the decision to direct the full force of its innovation on the core repository we knew it was time to convert. Crafter Rivet 2.0 is now based on the core repository, Solr search and Activiti workflow for Alfresco 4 CE and EE.  Making the switch from the AVM to the core repository required a deep examination of our use cases and the features we would have on hand within the core repository.

I thought it would be helpful to share some of that thinking.  Today we’ll look at the differences in these repositories and tomorrow, in a second post we’ll discuss the actual Web CMS use cases that need to be addressed and how we addressed them in Crafter Rivet. Let’s explore!

Unique Features of the AVM

The first thing I want to do is address the question of what the AVM can do that the core repository cannot.  Because we’re comparing repository to repository, we’re going to discuss features and not use cases. Note that for simplicity, on occasion we’ll collapse the features of the AVM repository, supporting UI and supporting integration into the single term AVM.  It’s also fair to note that what we’ll discuss here are the aspects of the AVM which were exposed through the associated user interface and thus applicable to customer engagements.  There are features/capabilities of the AVM repository that were never fully exposed by the UI.

Sandboxing (more accurately, layering)

Sandboxing is the ability for a user to work on any asset or group of assets within the site without interfering with another user’s work.  For example, a user could modify and even break a template or CSS file and no one would know until and unless the user promoted that broken asset out of their work area.  For non-technical readers, a sandbox is best described as your own copy of the website.  For the technical readers, you can think of a sandbox as a long running transaction — in source control terminology, a sandbox is like a checked out branch of the site.

Sandboxing is a high-order feature born out of a lower-level feature called “layering.”  The AVM is constructed of lightweight stores.  Stores can be layered one on top of the other.  The store on top appears as though it contains the assets of the store layered below it.  If a user places a changed asset in the top store, it appears to overwrite the asset at the same path below it.  Say we have a base store we’ll call “the staging area,” plus a store for each user, Bob and Alice, and both Bob’s and Alice’s stores are layered on top of the staging area; you can see how this creates the concept of a sandbox.  Alice can see everything in the staging area, but nothing in Bob’s store. Alice’s work is in a “sandbox.”  If Alice pushes her change from her store to the staging area, Bob sees it immediately as if it were in his store.
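The layering behavior described above can be sketched in a few lines (class and method names are illustrative, not the AVM’s actual API):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class LayeredStore {

    private final Map<String, String> assets = new HashMap<>();
    private final LayeredStore base; // null for the bottom (staging) store

    public LayeredStore(LayeredStore base) { this.base = base; }

    public void write(String path, String content) { assets.put(path, content); }

    // A read falls through to the store layered below when this layer has
    // nothing at the path -- the essence of AVM layering.
    public Optional<String> read(String path) {
        if (assets.containsKey(path)) return Optional.of(assets.get(path));
        return base == null ? Optional.empty() : base.read(path);
    }

    // Promote a change out of this sandbox down into the base store.
    public void promote(String path) {
        if (base != null && assets.containsKey(path)) {
            base.write(path, assets.remove(path));
        }
    }
}
```

With a staging store at the bottom and one layer per user on top, Alice’s edits are invisible to Bob until she promotes them, exactly as described above.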

Multi-Asset Versioning

In the AVM, versions are taken at the store level, not per asset. When you place a change in the store, a lightweight version is created for all assets at that moment. Because of this, it is possible to know the state of every object at a point in time relative to a given object. For Web CMS applications, we deal with compound assets all the time.  Consider a webpage. A webpage is not typically a single-file asset; it’s a document that points to other documents: components, images, CSS, JavaScript, etc.  When you version a page, you generally intend that entire collection of assets to be versioned along with the page.

The core repository manages each individual asset’s version history independently.  If page X points to component Y there is no innate support within the version system to know what version of Y was in use at any given point of X’s history.
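A toy model of the AVM’s store-level versioning makes the contrast concrete (illustrative names only, not AVM code): one snapshot fixes the state of every asset in the store, so a page and its components can always be read back together at the same version.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class VersionedStore {

    private final Map<String, String> working = new HashMap<>();
    private final List<Map<String, String>> versions = new ArrayList<>();

    public void write(String path, String content) { working.put(path, content); }

    // One snapshot captures the state of *every* asset at this moment,
    // which is what keeps compound assets (page + components) coherent.
    public int snapshot() {
        versions.add(new HashMap<>(working));
        return versions.size() - 1;
    }

    // Read any asset as it was at a given store-wide version number.
    public String readAt(int version, String path) {
        return versions.get(version).get(path);
    }
}
```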

Snapshots / Comparison and Rollback

In the AVM you can label a version and compare one version to another. It is easy to see what files have changed.  Because of this it is possible to create “diffs” from one version to another.  Once you have a diff you can roll back entire check-ins very easily.
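A sketch of how a diff between two whole-store snapshots can drive a rollback (illustrative only, not the AVM implementation): the diff is the set of paths whose content differs, and rolling back a check-in is just restoring the older snapshot’s content for each of those paths.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

public class SnapshotDiff {

    // Paths that were added, removed, or changed between two snapshots.
    static Set<String> changedPaths(Map<String, String> older, Map<String, String> newer) {
        Set<String> all = new HashSet<>(older.keySet());
        all.addAll(newer.keySet());
        Set<String> changed = new HashSet<>();
        for (String path : all) {
            if (!Objects.equals(older.get(path), newer.get(path))) {
                changed.add(path);
            }
        }
        return changed;
    }

    // Rolling back an entire check-in: restore the older snapshot's content
    // for every path in the diff, leaving untouched paths alone.
    static Map<String, String> rollBack(Map<String, String> current, Map<String, String> target) {
        Map<String, String> restored = new HashMap<>(current);
        for (String path : changedPaths(current, target)) {
            String old = target.get(path);
            if (old == null) {
                restored.remove(path); // asset did not exist in the older version
            } else {
                restored.put(path, old);
            }
        }
        return restored;
    }
}
```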

Content Forms (Now supported in core)

The AVM user interface used to support a forms capability that was not available for the core repository. The forms engine made it simple to create content capture interfaces through mere configuration.  Today the core repository has a forms capability that is more powerful than what was provided for in the AVM user interface.

Content Deployment (Now supported in core)

An AVM project could be configured with remote content receivers.  There was out-of-the-box support for repository to file system deployment (FSR) and repository to repository deployment (ASR).  Today the core repository provides two deployment mechanisms; Transfer Service and Channel Publishing framework, which combined now exceed the capabilities of the AVM content deployment framework.

Unique Features of the Core Repository

Now let’s look at what the core repository has going for it that the AVM repository and supporting UI never got around to implementing.  Again we’ll look at features rather than use cases.

Rules Support

The core repository allows you to attach rules to a folder that execute based on lifecycle events and configurable conditions.  This is extremely powerful, and it’s a feature that was sorely absent in the AVM.

Stronger Modeling Support

Both repositories allow us to create types and (more commonly) aspects which contain metadata.  However, the core repository allows for associations at the modeling layer. In the AVM, associations were kept only as paths within files.  This turns out to be fine for content delivery but bad for managing the content, because move and rename operations can trigger an unbounded number of updates.  Associations kept in files also make it difficult to support user experience features in your content management platform.  Users expect a platform to understand its assets and how they relate to one another in a way that can be quickly accessed through query.  The core repository can do this innately through its use of associations.  The AVM cannot.

Strong Transformation and Metadata Extraction

The transformation and metadata extraction frameworks integrated with the core repository greatly exceed the capabilities of those integrated with the AVM.  The AVM is integrated only with XML metadata extraction and transformation. The core repository, on the other hand, has integrated support for all kinds of metadata extraction and transformation, including Microsoft Office documents, images, PDF and many more.

Powerful, Fine-grained Permissions

The core repository gives us the flexibility to create and manage user access to content in a way that best fits an individual engagement through the use of ACLs (Access Control Lists).  While the AVM was based on a similar scheme under the hood, these were never exposed through the UI and thus were not practical to deploy on engagements.  Out of the box, the AVM exposed a few roles that could be applied broadly to the entire site in each sandbox.

API support

The core repository has much better remote API support: it supports CMIS, webscripts, and RAAr.  The AVM only supports a remote API based on RMI.

Workflow Engine

The core repository has 3 workflow engines integrated with it: Simple, JBPM, and Activiti. Activiti is based on standards and has parity with JBPM, but incorporates a far better management console. The AVM provides workflows based on JBPM integration only.

Search

Full-text search support is based on indexing, and you index a store. In the AVM universe, every web project was made up of many (layered) stores, and it was not practical to index every store.  Although you can configure individual stores for indexing, if every author in the system wants to be able to search their sandbox, you will hit obvious limitations with this approach. The core repository has only one content store, which is constantly tracked by a search index, which means that search stays very current with work in progress.  Alfresco 4 has introduced Solr as one of its search subsystems.  Solr provides capabilities that greatly exceed those of Lucene, the indexing engine used by the AVM.

Integrations

The core repository has many integrations that allow users to interact with content on their own terms, be it email, WebDAV, FTP, shared drive, SharePoint and so on.  With the exception of the filesystem projections, these are simply not available in the AVM.

Native support in the UI and APIs for taxonomies, folksonomies and collaboration features

The core repository has repository-service and UI support for:

  • Hierarchical taxonomies
  • Tags
  • Comments
  • Content lists

Making Sense of the Differences

By this time you should have a pretty good idea of how the repositories compare from a feature perspective.  Two observations are obvious:

  • The core repository has had the benefit of deep, continuous innovation.
  • The AVM has certain features intended for WCM use cases that will need to be addressed in a solution that leverages the core repository.

Join us tomorrow for a blog post that will walk through the use cases these features were intended to cover and how we addressed all of the Web CMS use cases in Crafter Rivet, capturing the full power and innovation of Alfresco 4 and the core repository.

Click here to learn more about Crafter Rivet

Click here to sign up for our webinar “Crafter Rivet – The WEM Solution for Alfresco 4” on April 12th at 1pm Eastern for further discussion and demos!