
Summary – Brocade Workflow Composer at NFD 12


August 23, 2016
by Yousuf Hasan

We had a great afternoon with the NFD team as we walked them through our vision, design principles and demos of the Brocade Workflow Composer. We had productive discussions and got good, candid feedback. Caffeine and sugar kept everyone’s critical abilities at their sharpest. There are four separate video sessions…here’s what happened:

Kick-off, Brocade Automation Vision, Overview

Jason kicked off the event by talking about the fact that Brocade has been focused on delivering agility and automation to customers for years…storage area networking, VCS Fabrics, etc. Brocade Workflow Composer (BWC) is the latest step in this journey towards agility and automation, where open source and DevOps are key driving factors. DevOps methodologies have been applied successfully to applications and compute by Google, Facebook, Amazon and similar webscale companies. BWC supports this culture through open source technologies and a community-centric approach, but adds network automation to the mix. And BWC has a strong starting point because it’s based on Brocade’s acquisition of StackStorm.

StackStorm is a proven open source platform, and it will stay that way. BWC will be the enterprise-grade version with enterprise bells and whistles such as RBAC/LDAP and Brocade technical support. Brocade wants to bring DevOps principles to networking. BWC is a microservices-based, off-box automation platform that can automate the network lifecycle as well as other IT domains such as compute, storage, applications and operations, all of which form essential components of a modern IT service. Network lifecycle automation spans automated provisioning, validation, troubleshooting and remediation – components critical to achieving agility and operating efficiency.

StackStorm Fundamentals

Dmitri Zimine, ex-CTO of StackStorm, defines StackStorm as event-driven automation done in a DevOps manner. Dmitri talked about automation not only for day 0 provisioning of IT services, but also the need to manage day 2 operations such as troubleshooting and remediation. Leaders in the IT industry already automate day 2 operations in-house – Facebook’s FBAR is one example. However, none of these tools are available to the public as open source projects or commercial products.

Dmitri described the founding of StackStorm and the product’s design principles. Sensors are integration points that tie in inputs from various domains and allow BWC to automatically react to one or more events of interest and take action, based on IFTTT (if-this-then-that) programmatic logic. BWC has 700+ community users who are contributing integration packs and workflows, delivering nearly 2,000 points of integration for others to use (check out StackStorm’s GitHub repositories). These include integrations into public clouds, containers, OpenStack, monitoring services, other automation systems and even your Tesla. The community-centric angle provides BWC with a rich set of existing cross-domain integrations to build on.

Dmitri showed a cool interactive demo and the BWC Design UI. The interactive demo used a StackStorm workflow to relay all StackStorm-related tweets from NFD12 events to a StackStorm Slack channel, highlighting the use of workflows, sensors, actions, and cross-domain integration. The BWC Design UI demonstration showed how users can visually tweak workflows via drag & drop and visual navigation – great for those just getting started in developing workflows.

Brocade Workflow Composer – Part I

Yousuf, Sr. Manager, Product Management for Workflow Composer, built on the StackStorm foundation by focusing on the network automation lifecycle and demoing the automated provisioning of a 3-stage IP Fabric on 7 VDX switches in less than 15 minutes!

First, the demo builds an IP Fabric with BGP EVPN, following the same design principles that webscale companies such as Facebook and Microsoft use. Once the workflows are tested, they’re error-free and can be reused. Manual provisioning of the same IP Fabric can take days (or more, if engineers make mistakes).

Second, he showed the ability to automate the entire network lifecycle. Yousuf highlighted that while the demo focused on IP Fabric provisioning, the key value for customers is in validation, troubleshooting and remediation of datacenter and other networking use cases. The NFD12 bloggers agreed: while fabric provisioning automation makes a good demo, people don’t build fabrics every day.

The real value is in event-driven automation where a workflow can be triggered by a sensor watching for network events such as link errors on a port-channel or environmentals going whacko. The sensor uses IFTTT to trigger a troubleshooting workflow to collect information and based on that, initiate a remediation workflow that uses BWC actions to steer traffic off the failing switch. All done without a single human involved.

The audience also highlighted the fact that network engineers lag in the programming/DevOps space, and that a network-centric UI for network lifecycle automation will benefit the networking folks. This is where BWC’s turnkey, customizable workflows come in. Turnkey workflows provide out-of-the-box automation but are customizable, enabling network engineers to adapt them to meet unique needs.

Brocade Workflow Composer – Part II

Lindsay Hill, Sr. Product Manager, Workflow Composer, pulled it all together by demonstrating how BWC uses StackStorm’s event-driven, cross-domain automation to detect and remediate issues with a critical IT service, informing and involving humans only as needed. The demo automated disk space remediation on a Linux webserver, using Sensu as a monitoring sensor that triggers a BWC workflow when the Linux server runs low on disk space. The workflow clears a logs directory to remediate the low disk space issue, and the success or failure of the workflow is reported to Slack via ChatOps integration.

To bring all this automation back to helping network operators, Lindsay also talked about automated BGP troubleshooting and remediation workflows which can turn a 3 am call into a 9 am e-mail review.

Lastly, Lindsay emphasized that StackStorm open source project features and community integration packs are critical success factors for Brocade’s automation journey. StackStorm community users and contributors can implement event-driven, cross-domain automation today. Using StackStorm as the foundation, BWC users can apply these principles to networking, integrated with enterprise security and the full backing of technical support. BWC and StackStorm are available as RPM and deb packages.

The post Summary – Brocade Workflow Composer at NFD 12 appeared first on StackStorm.


Connecting The Future: Event Driven Automation for Cross Domain Workflows. StackStorm and Brocade Workflow Composer at VMworld 2016


August 27, 2016
by Chip Copper

I’ve seen it over and over again.

A user puts in a simple request. Nothing elaborate or unusual – just a standard request. It goes into the queue, generates a ticket, and then… nothing.

Days or weeks later, an overworked, tired engineer in the back office operations room finally gets to the request, opens a process document, and begins the laborious task of doing the same thing that he or she has done dozens of times before. Ok, maybe the IP address or the user name has changed, but the process is the same as it has always been. And it is slow. Very slow. And the engineer may make a mistake. And, quite frankly, it’s frustrating for the user and boring for the engineer.

It’s 2016. We can do better. And many have, through workflow automation. And the best news is that you can too!

Networks have, until recently, required a lot of hands-on care to get them set up and reconfigured. Brocade saw the advantage of having an automated workflow do the job of walking through common networking workflows, and so recently acquired StackStorm. Yeah, that StackStorm. If you have any experience with DevOps and event driven automation of workflows, you will recognize StackStorm as the leader in open source event driven automation. StackStorm has a rapidly growing fan base—including Netflix, who has leveraged StackStorm in their remediation platform called Winston. The community has leveraged the StackStorm platform to automate all kinds of workflows and is showing strong support for building even more integrations into other tools and platforms including (but not limited to, as they say on TV): VMware, Cassandra, Microsoft Azure, Docker, Ansible, GitHub, Jira, Kubernetes, Google, Linux, generic email, MQTT, OpenStack, Puppet, Splunk, Yammer, MS Windows…. (My fingers are getting tired. Go read the entire list at http://github.com/stackstorm/st2contrib).

For customers wanting enterprise-level support and advanced features such as LDAP, user authentication, and next-gen visual workflow design, check out Brocade Workflow Composer (BWC)—the commercial version of StackStorm. Looking for network automation? Tap into the network automation suites developed for BWC.

Begin automating your workflows today. Join the community and install StackStorm. Want to see a demo of StackStorm and chat about how to apply event driven automation to your networking environment? Check out the StackStorm demo at VMworld. I’ll be in the Connecting the Future zone of the Brocade booth – Booth #935.

Or do you just want to wait? And wait. And wait. And wait….

The post Connecting The Future: Event Driven Automation for Cross Domain Workflows. StackStorm and Brocade Workflow Composer at VMworld 2016 appeared first on StackStorm.

StackStorm Enterprise is Back!


September 1, 2016
by Lindsay Hill

StackStorm Enterprise is back, and it’s now Brocade Workflow Composer. We’ve just shipped version 2.0. The platform has had a look & feel update, the usual round of bugfixes and enhancements, and we’re introducing “Network Automation Suites” for our networking friends. More on those in a minute.

We know that you’ve been asking about the future of StackStorm, where the project is going, when you can buy StackStorm Enterprise again, etc. Well, today we should be able to answer all of those questions.

Our community members will have noticed we’ve been a bit quiet recently. That’s because we’ve been working super-hard on getting this release out the door. There’s a lot of work going on under the hood to make it possible, so we’re pretty happy to see it shipped.

New Colors, New Names and the Return of Enterprise

StackStorm Enterprise is now Brocade Workflow Composer (BWC). What was known as StackStorm Community Edition is now just StackStorm, and it will remain open source. The key differences are that BWC includes Workflow Designer (formerly “Flow”), professional support, and now Network Automation Suites.

We’ve updated the colors for StackStorm, and we’ve “Brocade-ised” the colors for the BWC web interface. Here’s a sample of the new look – what do you think?

StackStorm Web Interface

BWC Web Interface

“Flow” has a new name, and a new look – it’s now “Workflow Designer”:

Workflow Designer

The BWC docs have a different home to the StackStorm docs. The StackStorm documentation remains at docs.stackstorm.com. The BWC docs include BWC-specific information, including details about the Network Automation Suites. They’re at bwc-docs.brocade.com, and yes, they too have a new look and feel:
BWC Docs

BWC is an add-on set of packages on top of StackStorm. To install it, you’ll need a license key. You can quickly get an evaluation key at brocade.com/bwc. Then you just need to follow the BWC install guide. BWC is available for purchase through Brocade and its partners. Thanks to those of you who have put up with us rolling over trial licenses over the last few months.

Network Automation Suites

We’re introducing the concept of “Automation Suites” with this release. These are additional packages, installed on top of the BWC platform, that address a specific network automation use-case. The first Automation Suite we’re releasing now is the “IP Fabric Automation Suite.” It targets Brocade IP Fabrics, providing the integration packs and additional services needed to provision and manage IP Fabrics. It’s more than just a traditional StackStorm integration pack – it provides things like an inventory service and an additional set of CLI commands.

We think this is a really good concept – by separating the platform from the specific vertical use-case, the platform remains “pure” and can be used by anyone, while we still have a way of deploying the integrations and services needed for a specific use-case. You can use BWC without needing to install the networking components.

Expect to see additions to this suite, and new suites, over the coming months. We also think this approach could be used by other people who want to use BWC as an automation platform for other environments. We learnt a lot in making these changes, so get in touch if you want to hear more about it.

It’s not just Re-Branding

We didn’t just change the colors and logos. We’re continuing to make StackStorm better, and we won’t be stopping any time soon. Here’s some of the stuff we’ve done since v1.6:

Platform Improvements

  • Upgrade pip and virtualenv libraries used by StackStorm pack virtual environments to the latest versions (8.1.2 and 15.0.3)
  • Jinja filter changes, including new custom functions to_json_string, to_yaml_string and to_human_time_from_seconds (see the sketch after this list)
  • ChatOps response includes execution time by default
  • Allow users to cancel multiple executions with a single st2 execution cancel command
  • st2ctl reload now runs --register-rules
  • packs.load default timeout is now 100s
  • packs.uninstall will now warn you if there are any rules referencing the pack being uninstalled
  • Python runner actions and sensors will still load even if the module name clashes with another module in PYTHONPATH
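
For example, the new Jinja functions can be applied with the usual filter syntax in rule or action parameters. A minimal sketch – the parameter names and values here are illustrative, not from a real pack:

parameters:
  message: "{{ trigger.payload | to_json_string }}"
  took: "{{ 4587 | to_human_time_from_seconds }}"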

Bugfixes

  • Fix validation of action parameter type attribute. Previously we allowed registration of any string value, and it would fail when executed. Now it will fail at registration.
  • Fixed a bug where Jinja templates with filters in parameters weren’t rendered correctly.
  • Fixed disabling and enabling of sensors through the API and CLI
  • Fixed HTTP runner so it works with newer versions of requests library (>= 2.11.0)

Thanks to everyone who contributed with bug reports and code.

Full details are, as always, in the Changelog

Getting v2.0

New packages are now in the stable repositories. If you’re already running StackStorm > v1.4, you can upgrade using yum or apt.
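
Something like this should do it, assuming a package-based install (a sketch – adjust the package list to what you actually have installed):

# Ubuntu/Debian
sudo apt-get update && sudo apt-get install --only-upgrade st2 st2web

# RHEL/CentOS
sudo yum update st2 st2web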

As always, we strongly recommend that you treat your automation code as true code – use source control systems, use configuration management systems. You break it, you get to keep the pieces.

The Future

Getting our first “Brocade” release out the door was a major milestone. We had to figure out a few things to get Brocade and StackStorm systems aligned. Now we’ve done that, it’s full steam ahead. More/better/faster releases coming!

The post StackStorm Enterprise is Back! appeared first on StackStorm.

How to Troubleshoot a Rule


Sep 20, 2016
by Dmitri Zimine

I set up a sensor to watch for a trigger (a trigger represents an external event; the sensor fires a trigger-instance of the trigger type when the event is detected). I created a rule: if the trigger happens, and matches the criteria, it should fire an action. I can see that the event happened. I expected the action to fire. But it didn’t happen. Where did it break?

This is a long read, and may look complicated. But really, it’s just three debugging steps. And it’s long because I refuse to write briefly, drop a bunch of hints along the way, and leave you distracted. But as they say in math, the thicker the math book, the faster it reads. Brace yourself.

In the example below, I’ll show you how we debugged our Twitter automation that scans tweets for mentions and posts them to Slack. A pretty good way to keep track of who is trash-talking about us! The debugging “runbook” is generic and applies to troubleshooting other rules just fine.

First, let’s look at the trigger chain and review how it works.

trigger-rule-action

An event happens. The sensor captures the event and emits… what? Previously we said, for brevity, “emits a trigger”. Now it’s time to get nuanced. It emits a “trigger-instance”. WTF? Let’s see. If a tweet is an event, how many of them do we have? Billions! And they are all of the same – what? Type! They are tweets! So, a tweet is an event type, while each individual tweet is an instance of the “tweet” event type. Good so far? Ok, now twitter.matched_tweet is a trigger that corresponds to the tweet event type. And each individual tweet, an instance of the “tweet” event type, is represented by a “trigger-instance”. So, simply: a trigger is a type, and a trigger-instance is an instance of that type. Therefore, when an actual tweet goes off, the sensor will emit a trigger-instance. Not clear? Read it again. Rinse. Spit. Continue. Proceed when it’s clear. Send us a note to break out of the infinite loop.

An event happens.
Sensor captures the event, and emits a trigger-instance.
Trigger-instance goes to a message bus, and hits the rule engine.
Rule engine checks: is the trigger-instance of interest to any rule? If so, does it match the rule criteria?
The act of matching the trigger-instance against the rule is called “rule-enforcement”.
If the rule matches, it schedules an action execution. An execution ID is created, and an execution request goes back onto the message bus, where an action runner picks it up and runs it, as the name implies.
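
For reference, a rule ties these pieces together in YAML. A minimal sketch of what our tweeter.relay_tweet_to_slack rule might look like (the empty criteria and the slack.post_message action ref are illustrative assumptions, not the exact rule we run):

---
name: relay_tweet_to_slack
pack: tweeter
description: Post matching tweets to Slack.
enabled: true
trigger:
  type: twitter.matched_tweet
criteria: {}
action:
  ref: slack.post_message    # assumed Slack pack action
  parameters:
    channel: "#twitter-relay"
    message: "A tweet from @{{ trigger.user.screen_name }}:\n{{ trigger.url }}"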

Step 0. Did the external event actually happen?

Check outside of StackStorm. In this case, I go to Twitter and see that tweet.

Step 1. Sensor configured and working?

$ st2 sensor list
+-----------------------------+----------+-----------------------------+---------+
| ref                         | pack     | description                 | enabled |
+-----------------------------+----------+-----------------------------+---------+
| ...
| twitter.TwitterSearchSensor | twitter  | Sensor which monitors       | True    |
| ...

st2 trigger list --pack=twitter
+-----------------------+---------+--------------------------------------+
| ref                   | pack    | description                          |
+-----------------------+---------+--------------------------------------+
| twitter.matched_tweet | twitter | Trigger which represents a matching  |
|                       |         | tweet                                |
+-----------------------+---------+--------------------------------------+

Remember that if you reconfigure a sensor (using config files or new config options), you must reload it for the configuration to take effect: st2ctl reload-component st2sensorcontainer. This applies only to sensors: actions pick up any change or new configuration without a reload.

Step 2. Did the sensor emit the trigger-instance for an event?

# st2 trigger-instance list
.... loads of output....

Oh no! This output is SO NOISY! How can I possibly find anything? How do I find the needle in this haystack? Look at the rule to check the trigger type, and filter by it. It’s twitter.matched_tweet, so:

st2 trigger-instance list --trigger=twitter.matched_tweet
+--------------------------+-----------------------+---------------------------+-----------+
| id                       | trigger               | occurrence_time           | status    |
+--------------------------+-----------------------+---------------------------+-----------+
| 57ae23b0d805641b8ed11de1 | twitter.matched_tweet | Fri, 12 Aug 2016 19:29:52 | processed |
|                          |                       | UTC                       |           |
| 57ae2ce2d805641b8ed12543 | twitter.matched_tweet | Fri, 12 Aug 2016 20:09:06 | processed |
|                          |                       | UTC                       |           |
|...
| 57ae834bd805641b8ed16c5d | twitter.matched_tweet | Sat, 13 Aug 2016 02:17:47 | processed |
|                          |                       | UTC                       |           |
+--------------------------+-----------------------+---------------------------+-----------+

If the trigger-instance for the event is not there, something is wrong with the sensor. It may not have captured the event, or something else has gone wrong. Check the logs at /var/log/st2/st2sensorcontainer.log and debug the sensor.

If the trigger-instance IS here, we move on to the rule.

If you’re not sure, inspect an individual trigger-instance with st2 trigger-instance get.
Hint: shape your ideal CLI output with combinations of the -a and -y or -j parameters, and limit the number of records with -n. E.g.

# st2 trigger-instance list -a "id" "occurrence_time" "payload" -y --trigger=twitter.matched_tweet -n 5
-   id: 57ae6724d805641b8ed155c3
    payload:
        created_at: Sat Aug 13 00:19:01 +0000 2016
        favorite_count: 0
        id: 764254896379932672
        lang: en
        place: null
        retweet_count: 0
        text: '@jiangu In that case, @Stack_Storm presentation at @Brocade. #NFD12'
        url: https://twitter.com/ecbanks/status/764254896379932672
        user:
            description: 'PacketPushers dot net co-founder. Podcaster & writer covering
                data center design & network engineering. I interview nerds so you
                don''t have to. CCIE #20655.'
            location: New Hampshire
            name: Ethan Banks
            screen_name: ecbanks
...

Step 3. Did the rule get enforced, match, and create an execution?

Scenario 1: NO.

It did not get enforced, so the trigger-instance didn’t reach the rule engine. Go back to Step 2, triple-check that the trigger-instance got emitted, and if it did, dive into the logs (run st2sensorcontainer with DEBUG) and troubleshoot at the RabbitMQ level.

Scenario 2: YES

It did get enforced, but it didn’t create an execution. For example:

$ st2 rule-enforcement list --rule=tweeter.relay_tweet_to_slack
+--------------------------+------------------+---------------------+--------------+------------------+
| id                       | rule.ref         | trigger_instance_id | execution_id | enforced_at      |
+--------------------------+------------------+---------------------+--------------+------------------+
| 57ae7037d805641b8ed15d18 | tweeter.relay_tw | 57ae7037d805641b8ed |              | Sat, 13 Aug 2016 |
|                          | eet_to_slack     | 15d16               |              | 00:56:23 UTC     |
+--------------------------+------------------+---------------------+--------------+------------------+

Uh-oh…

If “execution_id” is empty, it’s TROUBLE. Either the criteria didn’t match, or the Jinja template is messed up. Fire up st2-rule-tester and see: will this trigger-instance match this rule? All the input is conveniently at your disposal – rule.ref and trigger_instance_id are in the rule-enforcement list output above.

HINT: when copying IDs from table output kills you, remember the -y option – it may be handy!

st2 rule-enforcement list --rule=tweeter.relay_tweet_to_slack -y
-   enforced_at: '2016-08-13T00:56:23.576716Z'
    id: 57ae7037d805641b8ed15d18
    rule:
        ref: tweeter.relay_tweet_to_slack
    trigger_instance_id: 57ae7037d805641b8ed15d16
-   enforced_at: '2016-08-13T02:17:47.443764Z'
    execution_id: 57ae834bd805641b8ed16c60
    id: 57ae834bd805641b8ed16c61
    rule:
        ref: tweeter.relay_tweet_to_slack
    trigger_instance_id: 57ae834bd805641b8ed16c5d

Here we go, testing the rule!

st2-rule-tester --trigger-instance-id=57ae7037d805641b8ed15d16 --rule-ref=tweeter.relay_tweet_to_slack
2016-08-13 01:06:52,158 INFO [-] Connecting to database "st2" @ "0.0.0.0:27017" as user "None".
2016-08-13 01:06:52,224 INFO [-] Validating rule tweeter.relay_tweet_to_slack for matched_tweet.
2016-08-13 01:06:52,224 INFO [-] 1 rule(s) found to enforce for matched_tweet.
2016-08-13 01:06:52,232 ERROR [-] Failed to resolve parameters
    Original error : 'dict object' has no attribute 'errorHereForSure'
2016-08-13 01:06:52,233 INFO [-] === RULE DOES NOT MATCH ===

Aha! I’ve messed up the Jinja template. To fix it, I edit and update the rule. Before I update, I may want to check it. Note that st2-rule-tester can be used in “online” mode, working against real trigger-instance and rule objects in the system, or in “offline” mode, using a rule from a file and a trigger-instance captured to a file, or in any combination. Like this – here I edited the rule definition in a file and, before updating it, try it with st2-rule-tester:

$ st2-rule-tester --trigger-instance-id=57ae7037d805641b8ed15d16 --rule=relay_tweet_to_slack.yaml
2016-08-13 01:14:07,084 INFO [-] Connecting to database "st2" @ "0.0.0.0:27017" as user "None".
2016-08-13 01:14:07,142 INFO [-] Validating rule tweeter.relay_tweet_to_slack for matched_tweet.
2016-08-13 01:14:07,142 INFO [-] 1 rule(s) found to enforce for matched_tweet.
2016-08-13 01:14:07,150 INFO [-] Action parameters resolved to:
2016-08-13 01:14:07,150 INFO [-]    message: A tweet from @dzimine:\nhttps://twitter.com/dzimine/status/764264543321100288
2016-08-13 01:14:07,150 INFO [-]    channel: #twitter-relay
2016-08-13 01:14:07,150 INFO [-] === RULE MATCHES ===
-

It works! You can see what kind of action parameters I’m gonna send to my action from this particular trigger-instance.

Ok, now st2 rule update tweeter.relay_tweet_to_slack relay_tweet_to_slack.yaml, and the rule is fixed.
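
And for a fully offline check – rule and trigger-instance both from files – the tester also takes a captured payload file (a sketch, assuming you saved a trigger-instance payload as trigger_instance.json):

st2-rule-tester --rule=relay_tweet_to_slack.yaml --trigger-instance=trigger_instance.json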

If the external event is too important to miss, but it has already happened and isn’t going to happen again… you may want to re-fire your automation for it by re-emitting the trigger-instance, now that the rule is fixed:

st2 trigger-instance re-emit 57ae7037d805641b8ed15d16
Trigger instance 57ae7037d805641b8ed15d16 succesfully re-sent.

Checking… Look, now the same trigger-instance appears twice, and the re-emitted one triggered the desired action!

st2 rule-enforcement list --rule=tweeter.relay_tweet_to_slack
+--------------------------+----------------------+----------------------+----------------------+----------------------+
| id                       | rule.ref             | trigger_instance_id  | execution_id         | enforced_at          |
+--------------------------+----------------------+----------------------+----------------------+----------------------+
| 57ae7037d805641b8ed15d18 | tweeter.relay_tweet_ | 57ae7037d805641b8ed1 |                      | Sat, 13 Aug 2016     |
|                          | to_slack             | 5d16                 |                      | 00:56:23 UTC         |
| 57ae834bd805641b8ed16c61 | tweeter.relay_tweet_ | 57ae834bd805641b8ed1 | 57ae834bd805641b8ed1 | Sat, 13 Aug 2016     |
|                          | to_slack             | 6c5d                 | 6c60                 | 02:17:47 UTC         |
+--------------------------+----------------------+----------------------+----------------------+----------------------+

Conclusion

We do have this procedure documented in the Troubleshooting section of our docs. But we know that we’re short on tutorials, and we’re working hard to fix that.

Please tell us here, or on Slack, what other areas of StackStorm you’ve got questions about, and where you want help. Better yet, write it yourself! We will be happy to post your tutorials on our blog, promote them, or make them part of our documentation.

Happy automation!

The post How to Troubleshoot a Rule appeared first on StackStorm.

Minor update: v2.0.1


Sep 30, 2016
by Lindsay Hill

Hey folks, StackStorm v2.0.1 has been released, with a few small fixes and enhancements. We’ve also had some great code contributions recently, with a new integration for Datadog, and improvements to our GitHub and Nagios packs.

Read on for details, plus a few hints on what we’re up to next.

StackStorm v2.0.1

Mostly fixes and cleanups in this release:

  • Fixed the problem where st2actionrunner log files could disappear after a while. We tracked it down to a problem with our logrotate config, of all things.
  • Using --attr with the st2 execution get command now correctly works with child properties of the result and trigger_instance dictionary.
  • The st2 trace list command and associated API endpoint now list traces sorted by start_timestamp in descending order by default. You can also specify sort order by adding ?sort_desc=True|False query parameters, or by passing --sort=asc|desc to the st2 trace list CLI command.
  • Action default parameter values now support Jinja template notation for parameters of type object.
  • st2 key delete properly supports --user/-u. It was supposed to. Now it actually does.
  • Fixed a bug in st2web where it would cache old parameter entries.

Updates are available via apt or yum. You can insert your own broken record comment here about “backup first, yada yada…”

Community Highlights

  • New integration with Datadog, with a huge set of supported actions. Thanks @sanecz!
  • Updated GitHub pack with support for managing releases and deployments, courtesy of jjm.
  • AWS pack update to allow processing more SQS messages.
  • HPE-ICSP bugfixes from Paul Mulvihill

Want to see the future?

The cool thing about StackStorm code development is being able to see the WIP PRs before they get merged into the main codebase. Here are some interesting pieces coming up:

  • Extend auth to secure ChatOps: The much-loved @anthonypjshaw just couldn’t wait for our planned work around improving security controls with ChatOps, so he’s jump-started things, putting together a great PR.
  • Pluggable runners: Our very own @BigMStone is doing work to let us treat runners as plugins. This should make it easier to add new runner types in the future. Now I can finally write my actions in Visual Basic.
  • Pack Management: We want to make it easier to discover and share packs. This PR will lay the groundwork for that.

Watch out for them to get merged into future releases. If you’re really keen, try them out on your development systems, and pitch in with code & feedback!

The post Minor update: v2.0.1 appeared first on StackStorm.

OpenStack Summit Barcelona Preview: Resiliency, High Availability, Adaptability, and Self-Healing with StackStorm Event-Driven Automation


StackStorm at OpenStack Summit Highlights:


We’re less than two weeks away from OpenStack Summit in Barcelona. The StackStorm team is looking forward to seeing old friends and making new ones as we gather to share best practices and learn innovative approaches to delivering applications in the cloud.

Designing, deploying, and managing OpenStack, with an ever-increasing number of components, services, and tools to choose from, is a complex process. With event-driven automation, StackStorm provides an elegant solution for greater resiliency, adaptability and auto-remediation of your OpenStack deployments.

You may have heard that event driven automation helps Facebook save 16,000 person-hours in operations each day. Do you ever wonder “How can I accomplish this with my OpenStack cluster, and what tools will help me do it?” The closest answer you will find is StackStorm – an open source, Apache 2.0-licensed, event-driven automation platform. StackStorm is built on the same premise as Facebook’s FBAR and used by many to automate their growing infrastructure, private and public cloud deployments, network infrastructure, security, and operations. Stop by the booth (you’ll find us in the Brocade booth #) or sit in on the StackStorm theater presentation to learn about automatic troubleshooting and remediation of OpenStack infrastructure, minimizing downtime, and improving time to resolution.
We’ll also be discussing how Neutron and StackStorm work together to dynamically provision L2 networks on demand for multi-tenant isolation inside OpenStack.

Our friends at Mirantis will also be sharing best practices around StackStorm. During their session “Sleep Better at Night: OpenStack Cloud Auto-Healing”, they will discuss how StackStorm event-driven automation, running on their 1,000-node OpenStack cluster, helps eliminate headaches, expedite incident response by “assisting” engineers with troubleshooting, and lets their engineers sleep at night. They’ll share real-life examples of how StackStorm can take care of your cloud while you sleep: not only basic operations like restarting nova-api and cleaning ceilometer logs, but also complex things like rebuilding a RabbitMQ cluster or fixing Galera replication.

Mistral – the OpenStack workflow service – has gained popularity and usage within the OpenStack community. We have been active contributors to Mistral from the outset of the project – it is the drivetrain of the StackStorm event-driven auto-remediation platform. In the Newton release, Mistral’s focus has been on resilience, high availability, and usability. We are doubling down on our support of Mistral in Ocata. You can meet Dmitri Zimine and Winson Chan, active members of the Mistral project, at the Design Summit. This time around, in addition to participating in defining Mistral’s direction, we plan to explain to the OpenStack developer community how Mistral and StackStorm complement each other, and how StackStorm benefits the OpenStack community.

The topics above just scratch the surface of StackStorm’s capabilities. Stop by to see us and let’s unleash your imagination on how you can include StackStorm in your overall OpenStack automation strategy. If you know us, you know we’re passionate about event-driven automation. And if you are not already, you will be… once you see all the cool things StackStorm can do.

We look forward to seeing you in Barcelona!

The post OpenStack Summit Barcelona Preview: Resiliency, High Availability, Adaptability, and Self-Healing with StackStorm Event-Driven Automation appeared first on StackStorm.

Auto-Remediation Meetup: Stories & Demos by Netflix and Mirantis


Hello, fellow automators, and welcome back to our series of meetups – the summer is almost gone, and we are starting up the new season.

We have two speakers with really interesting topics:

Mirantis, who will share how they auto-remediate a 1,000-node OpenStack cloud at Symantec using the StackStorm platform.

Netflix, who will present (and demo!) Winston Studio – you might have read about it on the Netflix tech blog; now you will hear it from the source and actually see it.

WHEN: Thursday, October 20, 6:30 PM – 9:00 PM

WHERE: Theater @ Brocade, 130 Holger Way, San Jose, CA

Click here for registration and more info.

The post Auto-Remediation Meetup: Stories & Demos by Netflix and Mirantis appeared first on StackStorm.

Auto-Remediation with StackStorm & Splunk


Oct 21, 2016
by Siddharth Krishna

Splunk is a great tool for collecting and analyzing log data. StackStorm is a great tool for automated event-driven remediation. So what happens when we stick them together? Here’s how to use Splunk to collect syslog data and trigger event-based network remediation workflows using StackStorm!

Syslog as an event source

Syslog messages for events and errors from network devices can be tapped to trigger troubleshooting and auto-remediation actions on those devices. For example, if a ‘link down’ event occurs, the syslog message can be used to auto-trigger an action that logs into the device and tries to bring the interface back up. In parallel, an IT ticket can be auto-created with the relevant interface details. Notification of the event and the automated workflow can optionally be sent to a Slack channel (ChatOps).

One option is to have StackStorm itself act as the syslog server and run a sensor that polls the log file to match specific log strings. This method, although workable, has its performance limitations, and it doesn’t give us a nice tool for searching historical logs. Instead we can use something like Splunk, which is a log aggregation and analysis system. Splunk includes alert functionality – we can filter syslog messages, extract relevant fields, and trigger actions such as making a webhook request to StackStorm.

Here’s how to configure an auto-remediation workflow using Brocade VDX switches, Splunk and StackStorm:

NB: this guide assumes that you have a working StackStorm/BWC system and a Splunk server.

VDX Configuration

Configure all VDX switches to send syslog messages to the Splunk server:

logging syslog-server <splunk-ip-address> use-vrf mgmt-vrf

Setting up Splunk

We’re using the default Splunk Search & Reporting App here. You could also use the Brocade Splunk App.

If your server does not currently accept syslog input, add it by going to:

Settings -> Data -> Data Inputs -> UDP -> New

Input UDP port 514 (the default syslog port) and set Source Type to ‘syslog’.

Refer here for more on configuring data inputs on Splunk.

Splunk should now start displaying live syslog events from the switches in its search results.

Search -> Data Summary -> Search Type: syslog

Use the output of show logging raslog on a switch to identify an event or error of interest from its log string. In this case, we want to dynamically detect and act upon a link flap, for which we need to tap the syslog message [NSM-1003], 48201, SW/0 | Active | DCE, INFO, LEAF2, Interface TenGigabitEthernet 102/0/48 is link down.

View this code snippet on GitHub.

On Splunk, create search criteria to filter out the event. Here, we use the following:

sourcetype=syslog NSM-1003 process=raslogd | fields + host, interface, syslog_text

Stricter search criteria are recommended for greater filter accuracy. Once your search pattern is set up correctly, save it as a Splunk Alert with the alert type set to “Real-time” in order to capture the event live.

Save As -> Alert | Alert-type: Real-time, Trigger alert when: Per-Result

Splunk provides multiple options for Trigger Actions – run a script, send an email, or call a webhook. You can run a script that makes a cURL call to the StackStorm webhook URL or, for simplicity, use Splunk’s webhook option. StackStorm’s custom webhook URL is:

https://{STACK_STORM_HOSTNAME}/api/v1/webhooks/splunk_link_flap?st2-api-key=XXXXXXXXXXXXXXXXXXXXXXXX

For successful action execution, the relevant fields or parameters need to be correctly extracted from the event log message and passed via the webhook call to StackStorm. For example, to auto-remediate a ‘link down’ event, i.e. do a “shut; no shut” on the switch, the switch’s IP address and the interface name are a must. Splunk’s field extraction capability is useful for auto-generating the regular expressions that pull the field values out of the log string. These “field:value” pairs are then passed in the webhook JSON payload. More on field extractions can be found here.
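
Before wiring up Splunk, you can hand-fire the webhook with cURL to check the rule end to end. A sketch – the hostname, API key, and field values are placeholders, and the payload mimics the “result” object Splunk sends:

curl -k -X POST "https://{STACK_STORM_HOSTNAME}/api/v1/webhooks/splunk_link_flap?st2-api-key=XXXXXXXXXXXXXXXXXXXXXXXX" \
  -H "Content-Type: application/json" \
  -d '{"result": {"host": "10.10.10.5", "interface": "TenGigabitEthernet 102/0/48", "syslog_text": "Interface TenGigabitEthernet 102/0/48 is link down"}}'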

StackStorm/BWC Configuration

A custom webhook rule mapping the webhook trigger to the action workflow is created within StackStorm/BWC.

The custom webhook rule for ‘link flap’ defines the following:

  • Webhook trigger URL: “splunk_link_flap”, complete webhook URL to be configured on Splunk is “https://bwc/api/v1/webhooks/splunk_link_flap”
  • Action Reference: Name of the workflow to be executed when the webhook is triggered
  • Action Parameters (originally extracted by Splunk and passed via webhook):
    • Host: IP Address of the switch/device
    • Interface: Interface that went down
View this code snippet on GitHub.

Rule Definition:

splunk_link_flap.yaml
View this code snippet on GitHub.
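
Since the embedded gist doesn’t render here, a minimal sketch of what splunk_link_flap.yaml could look like (the pack name and workflow ref are assumptions; the trigger type and parameter templates follow the rule described above):

---
name: splunk_link_flap
pack: network_remediation            # assumed pack name
description: Remediate link-down events reported by Splunk.
enabled: true
trigger:
  type: core.st2.webhook
  parameters:
    url: splunk_link_flap
criteria: {}
action:
  ref: network_remediation.remediate_link_flap   # assumed workflow name
  parameters:
    host: "{{ trigger.body.result.host }}"
    interface: "{{ trigger.body.result.interface }}"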

Rule Details:

View this code snippet on GitHub.

When the ‘link down’ syslog message is detected by Splunk and the webhook to StackStorm is called, the trigger instance payload contains the values for the various Splunk fields and parameters. These include the host IP address and the interface name (as per the configured field extractions), along with standard ones such as the search link, search ID, raw log message, etc. See the example below:

View this code snippet on GitHub.

These field values can now be directly accessed from the StackStorm webhook rule using {{trigger.body.result.xxx}} e.g. {{trigger.body.result.host}}.

Upon successful enforcement, the rule executes the auto-remediation workflow, with the execution ID as shown below:

View this code snippet on GitHub.

In this example, the workflow for link flap remediation does the following (a condensed sketch follows the list):

  1. Notify “link down” event on Slack with the Host IP address and Interface name
  2. Pull configuration details for the given interface from the switch and post it to the Slack channel
  3. Try to bring the interface back up by executing “shut; no shut” on the switch
  4. Pull interface details (show interface detail) from the switch and post the output to Slack
  5. Create a Zendesk IT ticket for the event occurrence and attach relevant logs
View this code snippet on GitHub.
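
As a condensed action-chain sketch of that workflow (the action refs, channel name and ticket parameters are illustrative assumptions; the full version lived in the gist):

---
chain:
  - name: notify_link_down
    ref: slack.post_message                  # assumed Slack pack action
    parameters:
      channel: "#network-events"
      message: "Link down: {{ host }} {{ interface }} – attempting remediation"
    on-success: bounce_interface
  - name: bounce_interface
    ref: network_remediation.shut_no_shut    # assumed action wrapping "shut; no shut"
    parameters:
      host: "{{ host }}"
      interface: "{{ interface }}"
    on-success: open_ticket
  - name: open_ticket
    ref: zendesk.create_ticket               # assumed Zendesk pack action
    parameters:
      subject: "Link down on {{ host }} {{ interface }}"
default: notify_link_down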

Slack Channel

All code examples used here are available in the StackStorm st2incubator repo on GitHub.

This is just one example of a syslog-driven auto-remediation workflow. Workflows can be custom-built with actions and integrations for your environment. Use Jira instead of Zendesk? No problem! Quickly modify the workflow to use Jira instead! Use Hipchat? No problem, modify your st2chatops setup to use that! Easy.

What are you doing with event-driven remediation with StackStorm? Jump into our community and let us know!

The post Auto-Remediation with StackStorm & Splunk appeared first on StackStorm.



StackStorm Challenge At OpenStack Summit


At OpenStack Barcelona? Take the challenge, win a GoPro! Read on to find out how.


To be a good Stormer is to be a good hacker. Not necessarily in the sense that you enjoy compromising an unsuspecting grandmother’s laptop or a random webcam, but in the sense that you have a curiosity about how things work. This isn’t unique in the tech industry at large, and many security conferences exploit this curiosity for fun in the form of Capture The Flag (CTF) problem-solving games and various challenges. This year for OpenStack Summit we’ve built our own challenge for people to play. The main reason is that it’s just plain fun, but we also think it’s a great way to introduce people to the power of StackStorm. If you’re bent towards curiosity and are a hacker at heart, read on!

Did I mention there are prizes too? Because there are. We have four Raspberry Pis to give away (RPis are also powering the demo at our booth, so you should come look) as well as two GoPros.

No competition would be fair without a few ground rules, so allow me to lay them on you. First, if someone completes all 6 levels before Thursday at 10:00am, they will automatically grab one of the two GoPros. If both aren’t claimed by Thursday at 10:00am, they drop down into the pool of prizes for the raffle. On Thursday at 10:00am we’ll take a snapshot of the current standings, draw the raffle, and tweet out the winners. You’ll be free to continue the mission after that, but it won’t help your odds of getting a prize. Prizes can be collected at our booth between 10:00am and 3:00pm on Thursday, and you have to be present at OpenStack Barcelona to be eligible.

Levels 1 and 2 don’t require a running StackStorm instance to complete and are the easiest – so that is our lowest tier, and we will be giving away one Raspberry Pi in that group. Levels 3 and 4 will have two Raspberry Pis, one for each level, to raffle. Levels 5 and up will have one (or two) GoPros and a single Raspberry Pi for the raffle. The higher you go, the more likely you are to win. So have at it.

How do you play? Start at stackstorm.com/challenge and hopefully you’ll figure it out. Everything is communicated through Twitter and Gist. We accept answers in the form of a SHA1 hash salted with your Twitter handle: echo -n "MyFancyAnswer@BigMStone" | sha1sum is how you would generate it. Make sure not to have any newlines or trailing spaces, because our bot certainly won’t. If you have the correct answer, you’ll get the next level’s puzzle in a direct message (you won’t get the DM if you’re not following @Stack_Storm).

To submit an answer, publicly tweet the hash (generated as described above) to @Stack_Storm with #OpenStackSummit. @Stack_Storm and #OpenStackSummit can be included anywhere in the message: as long as you have the answer hash, the mention, and the hashtag anywhere, you are good. Use the remaining characters to express your feelings, too!

If you get stuck, join our Community Slack (if you’re not there, register at stackstorm.com/community-signup first). We’ll do our best to help without giving away too much, and we have some hints prepared for people who get stuck. Hope to see you at the booth and let us know what you think of the challenge once it’s over. Would you like to see more of this in the future? Make sure to tell us!

The post StackStorm Challenge At OpenStack Summit appeared first on StackStorm.

Auto-Remediation for Real: Self-Healing Clouds and Auto-Remediation as a Service


Oct 27, 2016
by Dana Christensen

What do companies like eBay, VMware, Facebook, Netflix, Mirantis, Verizon, Infosys, Cisco, Dell, LinkedIn, and Apple have in common?

They all attended the Auto-Remediation and Event-Driven Automation MeetUp on Oct. 20th hosted by the StackStorm team at Brocade in San Jose.

The topic for the evening was “Remediation For Real”, and both Mirantis and Netflix provided excellent overviews and demonstrations of how auto-remediation is helping to address business issues and increase their engineers’ productivity (and sleep time!). The power of community and meetups is to inspire and help each other raise our game – and the presentations and conversations before and after delivered. We saw real-life examples of auto-remediation use cases (and code), and shared best practices and key learnings. The presenters offered excellent suggestions for how to define and implement auto-remediation solutions, including how they decided on the underlying technology. StackStorm is the underlying engine for both the Mirantis and Netflix solutions, and it was helpful to hear why they chose StackStorm rather than attempting to build their own.

Mirantis kicked off the evening with a preview of their OpenStack Summit presentation, “Auto-Remediation: Making OpenStack Clouds Self Healing”. Their presentation focused on how auto-remediation is being used to streamline operations of the Mirantis-managed Symantec cloud – an OpenStack + AWS hybrid environment. The OpenStack environment consists of four regions and hundreds of racks, with thousands of compute nodes. The AWS environment is rapidly growing, with tens of thousands of cores. The monitoring, metering, and alerting ecosystem includes multiple solutions – most notably Zabbix, Nagios, Prometheus, PagerDuty and Volta. There are three Mirantis engineers assigned to run this relatively large cloud. The team realized that the time they spent working on outages was hindering their productivity in other areas, so they decided to investigate auto-remediation as a way to streamline and automate their day-to-day operations.

Watch the presentation for an excellent overview of the Symantec cloud environment, how the team identified operational patterns to automate in their day-to-day operations, examples of basic and advanced use cases for automating their OpenStack cloud, and key learnings from their deployment:

Netflix was next up to share key learnings about their auto-remediation platform, Winston. As a bit of background, Netflix engineers operate under a full ownership model: engineers are responsible for architecting and coding the customer experience AND they own deployment and all operational aspects of their service. From a business perspective, engineers have to balance the new feature development needed to move a service forward against the work needed to keep an existing service healthy. Enabling scale to match the growth of the Netflix business and a focus on SLAs are also key business goals.

Netflix is investing in the development of tools and services for their engineers to increase job satisfaction (eliminating the need to do a lot of mundane, repetitive tasks) and productivity. The presentation provided a great overview of the one-stop portal that they developed for Winston; how they incorporated best practices for things like compliance/auditing, reporting, and security (authority and authentication); and the UI that Netflix developed to provide auto-remediation as a service for their engineers.

The Netflix team shared use cases and helpful observations about rolling out an auto-remediation service. One of the interesting points is that the main barrier to rolling out this solution is not technical–it is cultural. Watch the video to hear suggestions for overcoming cultural barriers you might encounter when rolling out an auto-remediation project:

Key benefits that both Mirantis and Netflix touched on include:

  • Freeing engineers/developers from tier 1 support activities so they can focus on business issues
  • Enabling scale
  • Delivering on SLAs
  • Reduced MTTR
  • Reduced risk of human errors
  • Reduced pager fatigue

It was great to see familiar faces and make new friends at the MeetUp. As the popularity of StackStorm continues to expand worldwide, we are seeing MeetUps happening in Japan, Australia, London, Los Angeles, and Amsterdam. We will be sharing key learnings from these MeetUps in the StackStorm community. To date, DevOps has focused on deployment – but with the StackStorm community and the Auto-Remediation Meetup we are taking the conversation beyond deployment, to event-driven automation.

We look forward to continuing the conversation. See you at the next Auto-Remediation and Event-Driven Automation Meetup!

The post Auto-Remediation for Real: Self-Healing Clouds and Auto-Remediation as a Service appeared first on StackStorm.

StackStorm at OpenStack Barcelona


Nov 01, 2016
by Dmitri Zimine

Lights flashing at the StackStorm-Brocade booth made event-driven automation feel real, with the IoT demo entertaining enterprise and DevOps crowds alike. Mirantis’ talk about auto-healing the Symantec OpenStack cloud with StackStorm sparked hallway conversations about auto-remediation. Our guerrilla hacking challenge was a success, and revealed that the best hackers are all co-located…(guess where, anyone?)

Read on for details.

OpenStack Barcelona was a few intense and exciting days. News, presentations, rumors and talks over drinks, tech dives, marketplace booths, meeting old friends and making new ones – it all creates this unique and exciting atmosphere of intense learning and elevated thinking. OpenStack’s direction and destiny deserve a dedicated post.

For us, the best part was sensing StackStorm awareness and adoption. You could hear people dropping StackStorm and “auto-remediation” here and there in lounges, at lunches, and in hallways. Many of our booth visitors started with “we know you”, and some with “we use StackStorm”.

Booth

Our booth, although modest by corporate marketing standards, was definitely one of the most entertaining.
Now I regret not taking a video of @bigmstone flashing the lights to demonstrate event-driven automation… so just look at the photo and use your imagination.


The line of lights was controlled by an R-PI via a REST API. Another R-PI was paired with the color selector. You select a color using nice dials, press a big button, and a StackStorm instance takes it from there: a sensor fires a trigger, a workflow cranks, and an action turns the lights to the selected color. The setup withstood three days of intense use: we only knocked it down twice, and despite the little challenge of keeping the line of lights from sagging, the lights stayed on.

When the lights flashed vividly in instant response to a visitor tweet about #Stack_Storm, light bulbs went off in our visitors’ minds: that’s event-driven! “And now I can use Twitter to trigger troubleshooting workflows in my infrastructure.” Joke? Get this: GitHub relies on Twitter as a monitoring tool – when unhappy with service performance, developers tweet.
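For the curious, the glue behind a demo like this is an ordinary StackStorm rule. Here is a minimal sketch, assuming the Twitter pack’s matched_tweet trigger and a hypothetical lights.set_color action wrapping the R-PI’s REST API:

---
name: tweet_flashes_lights
pack: examples
description: Flash the booth lights when a matching tweet arrives.
enabled: true
trigger:
  type: twitter.matched_tweet
criteria:
  trigger.tweet:
    type: icontains
    pattern: "stackstorm"
action:
  ref: lights.set_color    # hypothetical action calling the R-PI REST API
  parameters:
    color: "blue"

One sensor, one rule, one action – the same IFTTT pattern scales from booth lights to kicking off troubleshooting workflows from a monitoring alert.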

Whether it was due to the IoT demo connecting the dots on event-driven automation, the state of DevOps evolution and adoption, or the growing need to deal with Day-2 operations, my impression is that we had much more in-depth and detailed conversations on troubleshooting, auto-remediation, and other event-driven use cases compared to previous years at the OpenStack Summit. Insightful talks with Accenture, Walmart, VW, RedHat, Cisco, NFV players, and many others.

Mirantis talk

Mirantis’ presentation hit the right nerve with the Ops crowd. Scheduling a “Sleep better at night” talk at 9 am was a good joke, but attendance was quite high despite the early hour. As DevOps automation progresses beyond puppet-chef orchestration and CI/CD, people recognize the need for Day-2 automation and understand that auto-remediation is not blocked on solving root-cause analysis first. The demo, though recorded, was so realistic that it felt live: it killed a network path on a host, simulating a partial hardware failure; Zabbix monitoring triggered an auto-remediation workflow that migrated VMs, opened a Jira ticket for the hardware issue, and sent an email so that an operator could check the case when he got up. More use cases were presented at the level of detail that only comes from field experience, which led to good hallway discussions after the talk.

Hacking challenge

The Hacking Challenge was not something the corporate enterprise crowd would notice, or something mentioned from a big stage. But among the life hackers, who luckily still have a good presence at the summit, it was a killer hit. The challenge was a last-minute improvisation: I grabbed prizes on the way to the airport; Matt, Winson, and Ed hacked it en route to Barcelona; we made up flyers and found a printing shop on the morning of the opening. And oh my, it worked! It became a parallel reality, with cryptic messages flying around on twitter and guys whispering about saving kitties at our booth… While the first two stages didn’t require much effort, the last stage required some serious hacking. To our excitement someone solved it, and got a GoPro camera! The other GoPro and a bunch of Raspberry-PI starter kits were raffled off among the rest of the players.

Here is a full list of winners:

@oLeksee – Grand Prize for 1st to complete – GoPro
@ngubenko – completed the challenge – GoPro
@The_cloudguru – stage 4 – rpi
@John_studarus – stage 2 – rpi
@DatkoSzymon – stage 2 – rpi
@e0ne – stage 1 – rpi

StackStorm hacking challenge winners

We can’t help the conspiracy jokes when the Russians from Mirantis beat all statistical distributions, winning both GoPros and 3 out of 7 prizes total. And NO, we didn’t stack the deck: all the challenge code is on GitHub.

Congratulations guys and come play again!

StackStorm, Mistral, and TOSCA at the Design Summit

Sensing some confusion around the relationship between Mistral and StackStorm, I gave a talk where I shared the history of Mistral and StackStorm growing together, shed light on the common question of “why did we have to re-invent a workflow system” (hint: DevOps!), and made it clear when to use Mistral and when to go for StackStorm. Slides are posted here, and by popular demand I plan to sum it up in another blog post to help equip users to make informed choices.

TOSCA was another discussion topic. I am pragmatically skeptical about standards, but here I see a case where the user can be a winner: make TOSCA recommend the Mistral workflow definition language for defining process models. Why now? NFV gravitates to TOSCA and, at the same time, has proved to rely heavily on solid workflow automation. Nokia-Alcatel use Mistral in their NFV solution, Tacker is picking Mistral for workflow integration, and Ericsson, AT&T, and some others are exploring it as a natural solution. The current TOSCA tongue-in-cheek recommendation of BPEL/BPMN is inadequate, and the Mistral DSL is a natural and timely choice. We have started the conversation and will likely pass the lead to our Tacker friends to drive it to a proposal.

Overall, another great OpenStack Summit for the StackStorm team; we’re now back home, energized with new ideas for building more exciting stuff for your day 2 operations, and beyond.

The post StackStorm at OpenStack Barcelona appeared first on StackStorm.

Execution Time for ChatOps commands


November 7, 2017
by Eugen C. aka @armab

Did you know you can do something like this with StackStorm ChatOps?
[Screenshot: ChatOps command execution time]

Looks simple, but it’s a very useful thing to have in your ChatOps toolset, especially for potentially long-running commands.

This small feature was implemented a while ago, but we didn’t make much of a song and dance about it, so you might have missed it.

  • You can use the execution.elapsed_seconds Jinja variable in a ChatOps alias template to get the action duration in seconds (e.g. 123.4422).
  • Additionally, apply the Jinja filter to_human_time_from_seconds to make it human-readable (e.g. 2m3s).

Putting everything together in a Jinja expression:

{{ execution.elapsed_seconds | to_human_time_from_seconds }}

And here is a ChatOps Alias example with the Slack attachments API in place:

# Example from:
# https://stackstorm.com/2015/06/24/ansible-chatops-get-started-%F0%9F%9A%80/
---
name: chatops.ansible_package_update
action_ref: st2-chatops-aliases.update_package
description: Update package on remote hosts
formats:
  - display: "update <package> on <hosts>"
    representation:
      - "update {{ package }} on {{ hosts }}"
      - "upgrade {{ package }} on {{ hosts }}"
result:
  format: |
    Update package `{{ execution.parameters.package }}` on `{{ execution.parameters.hosts }}` host(s): {~}
    {% if execution.result.stderr %}
    *Exit Status*: `{{ execution.result.return_code }}`
    *Stderr:* ```{{ execution.result.stderr }}```
    *Stdout:*
    {% endif %}
    ```{{ execution.result.stdout }}```
  extra:
    slack:
      color: "{% if execution.result.succeeded %}good{% else %}danger{% endif %}"
      fields:
        - title: Updated nodes
          value: "{{ execution.result.stdout|regex_replace('(?!changed=1).', '')|wordcount }}"
          short: true
        - title: Executed in
          # THIS line
          value: ":timer_clock: {{ execution.elapsed_seconds | to_human_time_from_seconds }}"
          short: true
      footer: "{{ execution.id }}"
      footer_icon: "https://stackstorm.com/wp/wp-content/uploads/2015/01/favicon.png"

FYI: since StackStorm v1.4, ChatOps has integrated with the Slack attachments API, allowing you to produce nicely formatted message responses like the one in the screenshot.

Enjoy!

If you have some cool ChatOps ideas, we’ll be glad to hear from you via a Feature Request – or meet us in the StackStorm public Slack.

The post Execution Time for ChatOps commands appeared first on StackStorm.

Unleash the power of IoT with Event-Driven Automation (ST2 & BWC @ SuperComputing16)


By Chip Copper, PhD; Principal Technology Evangelist, Brocade
Nov 14, 2016

The pattern is the same: “I can’t because we don’t…”, “But we are also…”, “Can I …”, and finally “How do I get it?”

We’re demonstrating Brocade Workflow Composer (BWC), powered by StackStorm, at SuperComputing16 in Salt Lake City. When you stop by booth 1131, we will give you a token of our appreciation: a NodeMCU microcontroller. It’s pretty cool – WiFi built in, lots of GPIO pins – and it can easily be programmed using the freely available Arduino IDE. It gets you into the world of IoT quickly and easily, and it’s a lot of fun! Bonus!

But there is a catch – you have to watch a demo to get it.


The StackStorm/BWC demo shows how to automate the acquisition and analysis of data from a number of sensors running on a test stand. To start the workflow, all you have to do is message the system. Picture yourself on your way into the office to start the day’s work. To get last night’s data, you open Slack (or your favorite chat tool) on your phone and send the message “!build the graph.” StackStorm/BWC is a member of the chat group, and it responds by telling you that it’s getting started. A few minutes later while standing in line at your favorite coffee shop, you get another chat message from StackStorm/BWC letting you know that the latest set of data is sitting on your virtual desktop ready for analysis. That’s the way to gather data!

So what happened in the background to make all that happen? When StackStorm/BWC got the message to get the data, it kicked off a workflow. The first step was to provision the network to allow the data to be read from the sensors. For security reasons we don’t want to leave the sensor data acquisition machine on the data VLAN all the time, so the automation begins by provisioning the network port attached to the sensors onto the data VLAN.

This is where I typically get the first bit of feedback – “I can’t because we don’t use Brocade switches on our data acquisition network.” That’s OK – this is open source software. We’d really like you to use Brocade VDX or SLX switches and our network automation suites (they’d make this a WHOLE LOT SIMPLER), but you don’t have to. The odds are that someone else has already written the sensors and actions that work for your network platform. If they haven’t, you can write them pretty easily and contribute them back to the open source community, just as many others have done. This means that you are not locked into any platform or API. You can use whatever you have.

The next step is to validate network connectivity to the sensors. It takes a few minutes for the sensor platform’s network port to come up, and BWC waits patiently until it receives confirmation that the sensors are ready. The sensor platform that we’re running here is based on Linux. Time for the second question:

“But we are also running experiments on dedicated lab devices with their own APIs. I can just interface with them the same way that I’m talking to any network device I want, right?” Right! Suddenly the reality of platform independence comes through. People stop thinking about the limitations imposed by brands of equipment and start thinking about workflows that can be built on whatever has been deployed.

BWC then transfers the data file from the sensor platform to the analytics platform using the native scp tools available in Linux, but again, that could be anything available on the platform you are using. Upon completion, the network port for the sensors is once again turned off. Heads shaking. They get it.

“Can I have the analytics chain be driven by StackStorm/Workflow Composer as well?” Yup, that’s the next step in the workflow. The data can be submitted to a work queue, the analytics can be done immediately, or in this case, the data is simply handed off to gnuplot for some quick validation.

The last step is to let you know that everything has been completed and is ready for your examination, and so you get the final chat while you are still in line in the coffee shop. If you like, it can even remind you to put some of that flavor stuff on top of your coffee when it is ready. (I’m not a coffee drinker so I don’t know what that stuff is, but I see people doing it all the time so it must be pretty good.)
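For a rough idea of what such a workflow looks like in StackStorm/BWC terms, here is a minimal action-chain sketch. The network.* actions are hypothetical stand-ins for whatever switch integration you use, and the linux and chatops actions come from community packs (parameter names approximated):

---
vars:
  sensor_host: "sensor-01.lab"       # hypothetical host and port names
  sensor_port: "eth1/12"
chain:
  - name: provision_port
    ref: network.enable_data_vlan    # hypothetical wrapper for your switch API
    parameters:
      port: "{{ sensor_port }}"
    on-success: wait_for_sensors
  - name: wait_for_sensors
    ref: linux.wait_for_ssh          # wait until the sensor platform is reachable
    parameters:
      hostname: "{{ sensor_host }}"
    on-success: fetch_data
  - name: fetch_data
    ref: linux.scp                   # copy the night's data to the analytics host
    parameters:
      source: "{{ sensor_host }}:/data/latest.csv"
      destination: "analytics:/incoming/"
    on-success: deprovision_port
  - name: deprovision_port
    ref: network.disable_data_vlan   # hypothetical: take the port off the data VLAN again
    parameters:
      port: "{{ sensor_port }}"
    on-success: notify
  - name: notify
    ref: chatops.post_message
    parameters:
      channel: "lab"
      message: "Last night's data is on your desktop, ready for analysis."

Bind an alias to “!build the graph” and the chat message you send from the coffee line is what kicks this chain off.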

Finally, the smile. The person watching the demo gets it. Platform-independent workflow management with an interface as easy as a chat session. “How do I get it?” Glad you asked! Just head over to StackStorm.com and off you go. Install StackStorm, and join the community. You’ll be inspired by the innovative use cases our community has for StackStorm/Workflow Composer.

“And here is your NodeMCU. Thanks for stopping by.” They smile. “Of course, I can automate this with StackStorm/BWC too.” “Of course!”

The post Unleash the power of IoT with Event-Driven Automation (ST2 & BWC @ SuperComputing16) appeared first on StackStorm.

Genomics Sequencing, StackStorm, and Reading the Source Code of Biology


By Dana Christensen
Nov 15, 2016

The DevOps movement is focused on leading transformational change and driving innovation. At the recent DevOps Enterprise Summit in San Francisco, many of the leaders in the field spoke of driving change through a culture focused on collaboration, community, co-creation, curiosity, continual learning, designing for joy, and meaningful work. I have been impressed with the passion and conviction with which leaders in the DevOps movement speak about and emphasize these key principles. Through determination and focus, living out these key values, and recognizing that everyone has a role to play, we will be able to truly unlock the power of technology to address the many complex global challenges we face today.

An excellent example of leveraging DevOps and technology to address complex global challenges is found in the field of genomics research. Through a focus on the values spoken about by DevOps thought leaders, and innovation at the speed of community, science, universities, government, and business – powered by advances in IT – are able to join forces to develop and evolve techniques that allow for the reading of the source code of biology: something that is incredibly complex and, in parts, extremely optimized. Through this important work, we are just beginning to unlock the secrets and miraculous mysteries of life on earth as we know it.


Genomics Sequencing Puts the “Big” in Big Data

The field of genomics puts the “Big” in Big Data. In short, it is projected that by 2025 genomics will produce about one zettabase per year (roughly the same amount in zettabytes) – following a trend of sequencing capacity doubling every 12 months. Genomics presents some of the most demanding computational requirements that we will face in the coming decade. These challenges will only be met through the power of community – where generative collaboration, co-creation, trust, and sharing can create an environment conducive to solutions that will unleash the power of genomics sequencing.

[Chart: growth of DNA sequencing]

The PLOS article Big Data: Astronomical or Genomical? provides an interesting overview of the challenges of managing this amount of data.

As discussed in the PLOS article, there are four components that comprise the “life cycle” of a genomics dataset: Acquisition, Storage, Data Distribution, and Analysis. Each of these domains contains its own set of challenges for the community.

In the area of data acquisition, in order to sustain the explosive growth in genomic data sequencing, it is critical to advance the development and application of technologies that reduce cost, increase throughput, and minimize human errors—all of which can only be accomplished at scale with automation. This is where StackStorm fits into the equation.

StackStorm Event Driven Automation & Big Data Genomics Sequencing

I recently had the opportunity to connect with members of the StackStorm community from SciLifeLab, a national center for molecular bioscience, hosted by Uppsala University, Stockholm University, KTH and the Karolinska Institute. The team there is responsible for the development and operations of SNP&SEQ (http://www.sequencing.se), a technology platform that provides sequencing and genotyping services for researchers, primarily within Sweden, but also some from abroad. Together with a couple of other sequencing platforms in Uppsala and Stockholm, they comprise the National Genomics Infrastructure, NGI, which is the largest technology platform within SciLifeLab, and one of the biggest sequencing centers in Europe (https://www.scilifelab.se/platforms/ngi/).

This year SNP&SEQ is expected to produce roughly 500 TB of data – following the industry trend of doubling capacity every 12 months. At SciLifeLab, the team’s answer to the challenge of scaling, while lowering costs, increasing throughput, and minimizing human errors is the Arteria Project (https://arteria-project.github.io/).

The team has leveraged the power of StackStorm as a hub to automate and streamline their complex sequencing workflows. StackStorm has played an instrumental role in allowing the facility to continue to scale with a relatively small team managing sequence operations. By leveraging StackStorm for automation, the team has been able to focus on the specifics of their “business case” rather than building up their own systems and interfaces.


StackStorm is being used primarily to drive their main processing and quality-control pipeline. The implementation consists of microservices running on a local compute farm, and a master node running all the StackStorm components. The current StackStorm use case involves sensors, events, and triggers that automate the processing of raw genomics data through a complex workflow that is quite large and long-running – it generally finishes within 24 hours. When the last reports are generated at the remote supercomputing center, manual processes kick off delivery of the data to the researcher and, in the case of human DNA, run a separate downstream best-practice pipeline called Piper. Incorporating these downstream processes into the main Mistral workflow, so that more of the work gets automated, is currently in progress.

Besides the workflows for sequence processing, the team is also making heavy use of StackStorm traces. They tag all their workflow runs with tags unique to each sequencing run, so that they can go back in history and check all associated executions with their custom script. This comes in handy for troubleshooting and also for auditing purposes (which is important for them, as they are an accredited facility).
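The pattern is simple enough to show in two CLI calls – a sketch with hypothetical action and tag names, using StackStorm’s trace tagging:

# Tag the whole trace with the sequencing run ID when kicking off the workflow
st2 run arteria.process_runfolder runfolder=/data/160930_RUN1234 --trace-tag "run-160930_RUN1234"

# Later, pull up everything that happened under that sequencing run
st2 trace list --trace-tag "run-160930_RUN1234"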

Innovation at the Speed of Community

StackStorm is a powerful event-driven automation platform, which provides the flexibility and autonomy needed to unleash the team’s creativity, giving them the freedom to innovate and automate genomic sequencing operations. The team’s work caters to the entire Swedish research community, and so serves a very wide variety of research, including cancer, cardiovascular disease, and microbial genetics. A list of publications which have used the lab’s resources is available here.

Two examples of recent interesting pre-prints/publications based on sequencing carried out at their facility are:

SweGen: A whole-genome map of genetic variability in a cross-section of the Swedish population – in which 1,000 Swedish individuals were sequenced to provide a genetic baseline for the population, something of great interest both to population genetics and to clinical research applications where these samples can be used as controls.

Complex archaea that bridge the gap between prokaryotes and eukaryotes – a group using the resources was able to identify what is possibly the missing link between prokaryotes (bacteria) and eukaryotes (the part of the evolutionary tree to which, for example, humans belong) by sequencing samples from a hydrothermal vent called Loki’s Castle.

As a member of the StackStorm team, it is gratifying to know that StackStorm technology is part of the effort to solve these challenges, and to think of the benefits that this research will bring to millions of people around the world.

Interested in learning how StackStorm can help you address your genomics or other Big Data challenges? Install StackStorm, join the conversation, and help drive innovation at the speed of community.

The post Genomics Sequencing, StackStorm, and Reading the Source Code of Biology appeared first on StackStorm.


2.1 is Coming to Town: Check the New Pack Management


by Dmitri Zimine and st2 team
Nov 29, 2016

Dear stormers and friends!

We are almost ready with a new and quite exciting platform release. This is a big release and we are really looking forward to it, and you should, too – just check out exchange.stackstorm.org to get excited about what is coming. As usual, you will soon see an announcement, a blog describing the highlights of what we’ve done, and a back-story on how and why we did it.

This, however, is an unusual heads-up. The changes around pack management that we are introducing in 2.1 are so substantial that it’s only fair to give you time to review the new features, read the changelog, and adjust your private packs or automation around StackStorm if needed. We’re inviting you to contribute – EASILY – by trying out the new functionality and sharing feedback, or helping us catch last-minute bugs.

The transition notes are drafted here. The full list of changes is in the change log. Please help us improve the docs wherever you find them unclear, confusing or imprecise – we are actively testing as we write this.

The biggest change is the introduction of StackStorm Exchange. All integration packs from st2contrib, as well as other packs scattered around, are being moved under StackStorm Exchange. Once the transfer is complete, you will submit new packs as PRs against the StackStorm-Exchange/exchange-incubator repo, as described in its README.md. Making features and fixes on existing community packs will be much easier now that each pack is in its own GitHub repo.


Shiny new StackStorm Pack Exchange.

The other big change is the pack management CLI, previously known as the “packs” pack. Version 2.1 introduces a new command – st2 pack – to the StackStorm CLI. It contains sub-commands for managing your packs and searching for new ones: try st2 pack help for the full listing. The new commands work with the new StackStorm Exchange, as well as with any pack on GitHub or on private Git servers. We encourage using git: you will see how our new features take advantage of this “each pack is a git repo” model.
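A typical session with the new CLI looks like this (using the sensu pack as an example – any Exchange pack name or git URL works the same way):

# Find a pack on the Exchange
st2 pack search sensu

# Install it from the Exchange...
st2 pack install sensu

# ...or install a pack straight from a git repository
st2 pack install https://github.com/StackStorm-Exchange/stackstorm-sensu.git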

The packs pack is still here, but has been changed to reflect this new model. As a result – WARNING! – subtree repositories (repositories containing multiple packs inside the packs/ subdir) are no longer supported. The subtree parameter in packs.install is removed. If you happen to use subtrees with your private packs, they will have to be split into multiple single-pack repositories in order for st2 pack install to be able to install the packs.

There are a lot of advantages to this approach, but what if you prefer a different one? Go for it! The st2 pack command, the packs pack, and the pack management API endpoints only make up an opinionated layer on top of st2 fundamentals, and you don’t have to use it. The fundamentals remain intact: place packs under /opt/stackstorm/packs, use a virtualenv per pack if they contain Python, tell the system to load the content – and your content is running. If your opinion on “how many packs should one repo contain” differs from ours – suit yourself: use your favorite deployment tool to put the content in the right place, register, and run. Do tell us “why”, though – we make design decisions based on what users tell us, and we want to count your opinion and learn from your way of running StackStorm.

Lastly, for all pack writers: please validate your custom packs against these two changes:

  1. The version field must conform to semver (semantic versioning): 0.2.5, not 0.2. If it does not, the pack registration will throw an error. Please check and update.
  2. The name field in pack.yaml should now only contain letters, digits, and underscores. No dashes! hpe-icsp is no good; hpe_icsp is fine.

Some other changes to take advantage of:

  • The pack metadata file can now contain a new optional contributors field – an array listing the people who have contributed to the pack. These days most packs have more than one contributor, so the author field is no longer sufficient, and we want to give credit where credit is due.
  • A stackstorm_version field has been added. It is optional and can contain a semver string telling which versions of StackStorm the pack works with (e.g. >=1.6.0, <2.0.0, or just >1.6.0). If your pack relies on functionality only available in newer versions of StackStorm, you can now specify that, and users won’t be able to install the pack unless they are running a compatible version.
  • The pack directory or git repository holding the pack no longer has to be named the same as the pack. Packs are no longer named and referenced by the parent directory or git repository: the name or ref field from pack.yaml is always used. Name your repository whatever you want (the recommended form for StackStorm Exchange is stackstorm-pack_name). A combined pack.yaml example follows below.
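Putting the rules above together, a pack.yaml for a hypothetical pack might look like this:

---
name: my_sensors
description: Sensors and actions for our internal monitoring.
keywords:
  - monitoring
version: 0.2.5                         # full semver; "0.2" would fail registration
author: Jane Doe
email: jane.doe@example.com
contributors:
  - John Smith
stackstorm_version: ">=2.1.0, <3.0.0"  # optional compatibility constraint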

That’s enough for the heads-up – more details are in the docs.

Please give this a try. To install 2.1dev and try the new st2 pack along with other goodies, pick one of:

  • Get st2 from the packagecloud.io/StackStorm/unstable repository – despite the unstable in the name, it will get you something that passed automated tests.
  • Run the install script on a fresh Linux box:
    curl -sSL https://stackstorm.com/packages/install.sh | bash -s -- --user=st2admin --password=secret --unstable
  • The easiest – get st2vagrant and run RELEASE="unstable" ST2PASSWORD="secret" vagrant up

We are looking forward to your feedback in the StackStorm Slack channel (registration here), at support@stackstorm.com, or in the blog comments right here.

The post 2.1 is Coming to Town: Check the New Pack Management appeared first on StackStorm.

Dec 8th Auto-Remediation Meetup: Optimizing Operations with Event Driven Automation at Dimension Data


Hi Everyone! It’s time to get together for another great evening to share our latest learnings around Auto Remediation and Event Driven Automation!

We have a very special guest – all the way from Australia! Anthony Shaw is responsible for innovation across Dimension Data’s global data centers, and he has lots to say about DevOps and innovative ways to tackle tough enterprise challenges.

Here’s a note from Anthony:


PROCEED TO MEETUP.COM TO RSVP

In 2016, when we talk about “DevOps”, we’re generally talking about deployment. Continuous Integration, Continuous Deployment.

How many times a day are you deploying?
How fast are your deployments?
Have you automated all your deployment stages?

When you get to the point where you’re doing hundreds of deployments a day, does that equate to a seamless experience for your clients?

The real challenge is that your in-house developed application or service probably isn’t the only piece of software your company is running.
Those other apps don’t need 10 deployments a day; the vendor might only release a new patch every six months, and that’s if you even decide to adopt it. DevOps has been operating in its own silo, and the gap is just getting wider.

Another problem:
Once you are deployed, what next? Is that it?
When something goes down, what do you do?

How do you integrate that old system that only has a command line interface from the 90’s?

I’ve seen, in every enterprise, tens or hundreds of point solutions fixing particular problems. We have one tool to monitor VM performance; we have another tool to run snapshots of our storage environments. We have 3 different monitoring solutions!

There was never really anything tying them all together, so when you looked at auto-remediation or event-driven automation, it was just going to be too hard to write all those custom integrations.

This is what StackStorm nails, and this is why I’ve been sharing and contributing to the project.

You can design workflows and triggers to bring all of your IT apps and systems into the DevOps world.

Come and join me at the meetup where we’re going to explore some really hard problems that Enterprises are facing and how to really leverage event-driven automation in StackStorm.

See you Dec. 8th!

The post Dec 8th Auto-Remediation Meetup: Optimizing Operations with Event Driven Automation at Dimension Data appeared first on StackStorm.

2.1 is here! New Pack Management and More!


December 6, 2016
by Lindsay Hill

Ta-da! It’s here! StackStorm version 2.1 has been released, and there are some big changes. So big that we started wondering if we should have called this release version 3.0. Pack management has had a lot of work done, and we think you’ll be pleased with the results. Plus good news for those patiently waiting for Ubuntu 16.04LTS support!

Packs Packs Packs…it’s (mostly) all about Packs

The big theme for this release is Pack Management. We’ve upgraded, enhanced, overhauled & refitted pack management, and we’re very pleased to introduce the StackStorm Exchange.


Shiny new StackStorm Pack Exchange.

With this change, working with packs becomes more like the “usual” package management you know from working with development platforms and operating systems. Installing, updating, and managing StackStorm packs has become a smoother, more streamlined experience.

We have a new CLI for pack management (st2 pack), a new GitHub organisation, and new workflows. Particularly if you’re a long-time user, you must read the documentation to understand the changes.

Key things to watch out for:

  • Pack names now use underscores, not dashes
  • The “packs” pack is now deprecated. Use st2 pack instead
  • Pack versions must use semantic versioning. 0.1.0 is fine, 0.5 is not.
  • st2 pack config only works with the newer config.schema.yaml style of configuration (a minimal example follows below).
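If you haven’t migrated yet, config.schema.yaml is just a YAML description of your pack’s configuration values. A minimal sketch for a hypothetical pack:

---
api_host:
  description: "Hostname of the service the pack talks to"
  type: "string"
  required: true
api_key:
  description: "API key used to authenticate"
  type: "string"
  secret: true
  required: true
timeout:
  description: "Request timeout in seconds"
  type: "integer"
  default: 30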

All packs in the StackStorm Exchange have been updated to account for these differences. Your private packs may need changes too. It’s a big change – that’s why Dmitri put out the early warning. We’ll also have a follow-up blog coming soon explaining more about the changes, why we made them, and all the cool things you can now do.

Ubuntu 16.04LTS – Xenial support!

So, you started asking for Xenial support about, oh, ten seconds after Ubuntu released 16.04 LTS. We’re pleased to announce that it is here at last. Ubuntu 16.04 is now a supported platform, and packages are available. Run the one-line installer and it will auto-magically sort it out. Or do the install manually. It’s up to you.

Aside: RHEL 6.x/CentOS 6.x is starting to get long in the tooth, and becoming more of a pain to support. How hurt would you be if we dropped RHEL 6.x support in May 2017? Let us know!

Other Bits & Bobs

OK, so this release is mostly about new pack management. But it’s not just that. There are a few other fixes & improvements too. The full list is in our changelog as always, but the highlights are:

Enhancements:

  • Performance: Speed up short-lived Python runner actions. We’ve re-organized and refactored some code to avoid expensive imports in the places where those imports are not actually needed. You should notice this if you have a lot of small actions being spawned.
  • Add support for default values and dynamic config values for nested config objects. This one is really important for using config.schema.yaml – which you should be doing now. No excuses for not migrating your packs.
  • Improved performance for querying action execution history, with additional indexes and allowing users to supply multiple resource IDs when filtering results.

Bugfixes:

Yes, there were bugs. We’ve been trying the stick approach with the developers, but it hasn’t worked. Given their dietary habits, I’m not sure the carrot approach will help either. Until we find a way to write perfect code every time, we’ll keep doing what we’re doing: fix bugs just as soon as we find them, and add tests so they don’t happen again.

Here’s a few notable fixes:

  • Action parameter names should only allow valid word characters (a-z, 0-9, _). That was always the intention, but we let you get away with anything for a while there. No more. Pack registration will fail if you try to use a verboten character.
  • Sensors tried to use a temporary token to access the datastore, and didn’t do anything if that token expired. This caused much confusion when your custom sensor worked for a while, then randomly stopped behaving. Sorry about that, pixelrebel.

Thanks to Anthony Shaw, Paul Mulvihill, Eric Edgar and more, for their contributions.

Installing & Upgrading

We strongly urge you to read the upgrade notes before upgrading. Things have changed, and if you blindly upgrade without paying attention you will get caught out. It’s bad enough to strike a bug when you upgrade, but it’s more embarrassing to be caught out by a known, documented change in behavior.

New 2.1 packages are now in the stable repositories. If you’re already running StackStorm 2.0, you can upgrade using yum or apt.

As always, we strongly recommend that you treat your automation code as true code – use source control systems, use configuration management systems. You break it, you get to keep the pieces. This is particularly important for this release where we’ve made many changes to pack management.

Of course, if you have any problems, jump into our Slack Community, and we’ll do our very best to help.

The post 2.1 is here! New Pack Management and More! appeared first on StackStorm.

Innovation at Dimension Data: Taking DevOps Beyond Deployment


December 6, 2016
By Dana Christensen

At the DevOps Enterprise Summit in San Francisco last month, DevOps leaders like Target, American Airlines, Disney, and Quicken Loans spoke of the importance of collaboration, eliminating silos, managing and optimizing operations, addressing technical debt, open source and contributing to the community, and continuous learning. I was especially struck by Jason Cox of Disney, who stressed the importance of “creating a culture of courage” where teams are encouraged to be curious, to experiment, explore, and embrace change – including process, roles, and technology.

Anthony Shaw at Auto-Remediation meetup

It turns out that these themes are being talked about not only by the large DevOps leaders: they are being repeated in conversations around the globe with IT leaders who are looking to leverage DevOps practices to successfully drive Digital Transformation.

Anthony Shaw, Director of Innovation and Technical Talent at Dimension Data, has seen this first-hand. Anthony not only leads innovation efforts within Dimension Data, but also travels the world speaking with customers about their business priorities, their Digital Transformation goals, and how the innovative use of technology, organizational structure, and day-to-day operational practices can accelerate their Digital Transformation journey.

Last week we were fortunate to have Anthony travel from his home in Australia all the way to the US to present at the Gartner Data Center Conference in Las Vegas, and then join us to speak at the Auto Remediation and Event Driven Automation Meetup here in San Jose, where he shared his thoughts on key DevOps Trends & Challenges–and some creative ways to address these challenges leveraging the power of the StackStorm platform.

Anthony’s Observations on DevOps Trends & Challenges in 2016

DevOps Spaghetti: Emergence of Multiple Point Solutions

The Challenge:

With all of his clients, Anthony has found that point solutions have been multiplying exponentially. Enterprises are finding themselves challenged to manage and maintain the tens to hundreds of point solutions they’ve implemented to fix problems that have come up over the years. They’ve looked at a problem like monitoring, for example. They’ve picked one tool for monitoring the storage, another tool for monitoring their virtualization layer, and before they know it they’ve got 20 different tools all monitoring something.

What Many Companies Think of as DevOps Barely Scratches the Surface

The Challenge:

Anthony has observed that when many customers speak of DevOps, they tend to limit their focus to application deployment. Most DevOps tools on the market today are focused on getting the application deployed as quickly as possible, or on updating the application in production and staging environments 10x or 100x per day. These practices typically apply to shiny new in-house developed applications and services. However, for most of these clients, application deployment is maybe 5% of the overall problem. The major challenges are around operational support, availability, and scaling.

Organizational Structures: Centralize or Decentralize?

The Challenge:

Organizationally, when looking at DevOps, many companies are tempted to centralize to avoid a proliferation of tools. They’ll pick one team and say, “This is the DevOps team,” and they’ll write a process and a procedure for that team. The problem is that that team immediately becomes a bottleneck.

The other option that companies look at is decentralizing. They will provide DevOps training for the teams, but each team makes its own decisions around which tools to use. The issue quickly becomes a lack of standardization: teams pick different tools and different approaches, with minimal to no collaboration or communication between them. Documentation, knowledge sharing, and joint problem solving are typically scarce. These emerging silos and the lack of collaboration quickly become a barrier to moving forward in the client’s Digital Transformation journey.

Summary:

As companies look to implement or expand their DevOps practices, Anthony recommends they look to break down silos, collaborate, manage technical debt, and optimize operations with event-driven automation.

During his presentations at Gartner and at the Auto-Remediation and Event-Driven Automation Meetup, Anthony shared how Dimension Data has been addressing these challenges and what best-practice recommendations he makes to customers. He spoke of how they are leveraging the power of the StackStorm platform to overcome these challenges and optimize the data center. You can view the recording of Anthony’s presentation here.

The post Innovation at Dimension Data: Taking DevOps Beyond Deployment appeared first on StackStorm.

Quick update: v2.1.1 published


December 20, 2016
by Lindsay Hill

Just in case you missed it, we published StackStorm v2.1.1 late last week. This is a minor update, on top of the major changes to pack management we made with 2.1.

There are a few small bugfixes and enhancements:

  • core.http now supports HTTP basic auth and digest authentication.
  • Local action runner supports unicode parameter keys and values.
  • Improved error handling and more user-friendly messages for packs commands and APIs.

Full details in the Changelog.

This is a recommended update for all v2.1.0 users. Use yum or apt to upgrade your system. If you’re not yet running v2.1.0, make sure you read all the Upgrade Notes. There were some significant changes in v2.1, and it may break your custom packs. It’s worth it though, we promise.

The post Quick update: v2.1.1 published appeared first on StackStorm.
