Our blog data becomes even better

A few weeks ago we published an overview of all the things Twingly can do for your business. As you can read there, our main focus is and has always been to help websites such as online shops and media outlets open up to the blogosphere and connect with blogs in order to increase their reach and offer additional context and information to their visitors. But in the past years, a second field of activity has been gaining importance for us: providing media monitoring companies and other organisations with data from the blogosphere.

We figured that if we index the blogosphere for our blog search engine and our widget solutions connecting websites and blogs, we could use this data to improve companies’ knowledge about what’s hot and trending in blogs. So we launched our API offerings. Last year we explained in detail our offerings to media monitoring companies, which comprise three different APIs.

Our goal is to have the best blog data in Europe in terms of coverage, quality and immediacy.

So what are the aspects we are working on to provide our API clients with even more compelling and complete insights into the world of the blogosphere?

For one, in terms of overall coverage, for the past couple of months we have been building new providers for local blog hosts in Europe, with a focus on the Nordic countries. In the graph below you can see, for example, that we deployed a provider for a Norwegian blog host at the beginning of October. We also deployed one for Finland last week, and you can spot that the Finnish data is about to increase.

Both in the graph above, for the Nordic countries, and in the graph below, for Total, Swedish and English data, you can see a big bump at the beginning of September. This is due to the deployment of several other providers. They usually find blogs that were previously unknown to us and index them back in time, hence the temporary increase in data. Apart from providers we have also built other indexing robots, which together with the providers widen our overall coverage. In the graph below for Total, Swedish and English data you can see that the average total number of blog posts per day has increased from just above 500,000 to over 600,000.

Another improvement has to do with immediacy. A lot of our customers want to get new posts as soon as possible after they are published. Therefore we have built a new indexing algorithm that looks at how often a blog publishes and assigns it to one of several indexing cycles, ranging from every 10 minutes to once a month. In this way we can index new blog posts much more efficiently. The glitch in the indexing at the beginning of November is due to the deployment of this algorithm.
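To illustrate the idea of frequency-based indexing cycles, here is a minimal Python sketch. It is not Twingly's actual algorithm, which is not described in detail here. The tier values in `CYCLES`, the function name `indexing_cycle` and the rule of using the median gap between posts are all assumptions made for illustration; only the range from 10 minutes to one month comes from the text above.

```python
from datetime import datetime, timedelta
from statistics import median

# Hypothetical tiers: only the two end points (10 minutes, one month)
# are taken from the post; the steps in between are assumptions.
CYCLES = [
    timedelta(minutes=10),
    timedelta(hours=1),
    timedelta(hours=6),
    timedelta(days=1),
    timedelta(days=7),
    timedelta(days=30),
]


def indexing_cycle(post_times):
    """Pick a polling interval from a blog's observed publishing history."""
    if len(post_times) < 2:
        # Too little history to estimate a rhythm: use the slowest cycle.
        return CYCLES[-1]

    ordered = sorted(post_times)
    gaps = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]
    typical_gap = timedelta(seconds=median(gaps))

    # Choose the slowest cycle that is not longer than the typical gap
    # between posts; very active blogs bottom out at the fastest cycle.
    chosen = CYCLES[0]
    for cycle in CYCLES:
        if cycle <= typical_gap:
            chosen = cycle
    return chosen


if __name__ == "__main__":
    start = datetime(2012, 11, 1)
    busy_blog = [start + timedelta(minutes=30 * i) for i in range(10)]
    quiet_blog = [start + timedelta(days=60 * i) for i in range(4)]
    print(indexing_cycle(busy_blog))
    print(indexing_cycle(quiet_blog))
```

In this sketch a blog that posts every half hour lands in the 10-minute cycle, while one that posts every two months is only checked once a month, so crawl capacity goes to the blogs most likely to have something new.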

Taking a closer look at the aggregated blog data also means that we at Twingly have lots of insights about different industries, areas of interest and trends. In order to show the potential of our data, we recently published three reports: one about the power of bloggers in the book e-commerce market, one about the power of bloggers within broadcasting and one about how blogging engages the fitness world. All were featured in different media outlets and blogs and can be downloaded here: twingly.com/reports

More than ever, data quality is our focus. We’re constantly adding new features, initiatives and providers to ensure that we are the best blog data service in Europe, and the world. Our product development is all based on customer feedback so if you have thoughts, ideas or a problem – please let us know. And if you aren’t being provided with blog data from Twingly yet, we would love to hear from you: info@twingly.com.

The Swedish Armed Forces relaunch their blog portal supported by Twingly

blogg.forsvarsmakten.se


A few days ago, Försvarsmakten, the Swedish Armed Forces, relaunched their blog portal blogg.forsvarsmakten.se where members of the Armed Forces blog about their everyday life at work – both in Sweden and abroad. For the new version, Försvarsmakten decided to use Twingly technology in order to find blogs from around the web that discuss topics related to the Swedish Armed Forces and their assignments.

The new blog platform looks extremely neat, and the Twingly technology working in the background makes for a great user experience for visitors of blogg.forsvarsmakten.se who want to learn about related issues discussed by bloggers around the web.

You can see that in action by going to blogg.forsvarsmakten.se/omvarldsbevakning. What you see is a pretty cool monitoring section which shows content from Twitter, Facebook, Google+, YouTube, internal blogs and external blogs that is related to the Swedish Armed Forces. If you uncheck all the boxes on top except “Externa Bloggar”, you are presented with the latest blog posts from the blogosphere relevant to those who have an interest in the Armed Forces.

We are proud that Försvarsmakten trusts Twingly as the provider of complete blog data in order to make the global blogosphere accessible to everyone interested in topics related to the Armed Forces.

“The biggest challenge with Big Data is to stop focusing on Big Data”

Every second, a huge and ever-increasing amount of data is published on the web. Gavagai, a Twingly Data client based in Stockholm, has developed a technology to read, aggregate and understand this content. Fredrik Olsson, the Chief Data Officer, gives some insights into this fascinating business and into what the startup is able to do with the blog data it collects.

At Gavagai, you do some sophisticated stuff. Please tell us in a few sentences what your business is all about?
It’s about continuously reading tremendously large and dynamic text streams, and delivering timely and actionable intelligence based on the aggregation of information therein. Of course, what is actionable depends on the information needs you as an actor in a particular domain have, be it brand management, assessing threat levels for targets-at-risk, or keeping track of the sentiment towards a particular tradable asset. Example information needs that you are able to address using Ethersource, our system, include:

* How is my brand perceived in comparison to those of my competitors?
* Why are my customers unsubscribing from the services that I’m offering?
* When is the best time to launch this particular advertising campaign?
* How is the campaign, recently launched by my competitor, received among my target audience?
* Where is it most likely that the on-line protests against a certain phenomenon will be publicly manifested in terms of a demonstration?

We have a number of case studies available on our blog.

Fredrik Olsson

What’s the founding story of Gavagai?
Gavagai was founded in 2008 by my colleagues Jussi Karlgren and Magnus Sahlgren, as a spin-off from the Swedish Institute of Computer Science (SICS). Gavagai was formed as a response to the many inquiries Magnus and Jussi received from people outside SICS regarding their research. Gavagai has been operational in its current incarnation since late 2010.

You are one of Twingly’s Data clients, which means you are using our API to access data from Swedish- and English-language blogs. Why do you need this information and what do you use it for?
We read data from Twingly 24/7. In particular, the Twingly live feed gives what we believe to be very good coverage of Swedish blogs, which of course is very important to us in meeting the kinds of information needs outlined above, expressed by domestic actors.

Do you have any insights about this data from blogs in Swedish and English you want to share? Some surprising fact or observation?
One epiphany we had some time ago was that we’re now able to aggregate and inspect attitudes and opinions of a population as a whole that are not necessarily visible in any of the parts. For instance, we can clearly see that Swedish bloggers are optimistic during holidays and weekends, something which is very hard to assess from the posts of any one individual. Analogously, we also pick up on aversive or hostile tendencies in the online population towards a given subject, where it is hard to identify all the facets of the tendency in any one individual. For example, we recently set up a Xenophobic Tracker using, among other things, the Swedish blogosphere as input; the propensity for violent expressions in that context is not a pretty read.

But it’s not the peak items that we’re most pleased with. With Ethersource, we can pick up and note weak signals and tendencies where other methods fail.

What type of companies or organisations use your services?
The kinds of actors that require actionable intelligence in their efforts to manage brands, make informed decisions based on the ‘temperature’ of an on-line population as a whole, keep track of the general mood in the markets, or trade with specific assets.

Your title is “Chief Data Officer”. That’s not too common, is it? Do you think every company will need a CDO in the future?
No, I don’t think every company will need a CDO in the future. Hopefully, companies will be able to scale down on their data management activities, perhaps due to their use of tools and techniques such as Ethersource, and instead focus on their core business. Much the same way we are able to focus on our core business by obtaining data from Twingly instead of harvesting it all ourselves.

Big Data, a field you are active in, is one of the hottest buzzwords right now. What are the potential and the biggest challenges of the increasing amount of data?
We’re currently concerned with human-generated text, so it is in the light of that the response to this question should be read.

The biggest challenge with Big Data is to stop focusing on Big Data. Big Data will, by virtue of the prevailing definition, always be slightly too big to handle with common tools. This has mainly resulted in people being obsessed with processing speed and the ability to store large amounts of data. Few, if any, have focused on a layer in the so-called Big Data Stack that so far has been missing: the Semantic Processing Layer. The key challenge for Big Data is to come to the point where it is easy and swift to turn massive data streams into actionable intelligence; knowledge that you and your organization can act upon in order to obtain a competitive advantage. To put it another way: the key challenge of Big Data is to be of service.

Being a researcher by training and heart, I believe that we’ve yet to imagine the biggest potential there is in harnessing truly Big Data. Let’s talk about that in a few years, when a more representative sample of our world’s population is active on-line. Then, we’ll be able to find the collective answers to questions for mankind that we’re not able to think of now.

What’s on your roadmap for the upcoming years? Where do you see the biggest growth and potential for Gavagai?
We’ve got very exciting times ahead of us! Ethersource is already unique in the way it is able to read amounts of text that would overwhelm traditional language processing methods, handle multiple (all) languages, in real-time, and learn from variations in the input in an unsupervised manner.

Our development plans involve some fairly hefty stuff. In the short term, we’ll roll out a game changer in terms of a way of identifying the many meanings of a given concept, and use that information to disambiguate expressions of that concept as they appear in social media. For instance, imagine that you are a brand manager for Apple, Visa, “3” or some other brand with an inherently ambiguous and common name: How do you go about monitoring the attitudes and opinions towards the meaning of the word that constitutes your brand, and only that meaning? There is a solution…

The biggest growth and potential for Gavagai is as a supplier of the Ethersource technology to other companies, such as analytics firms, trading desks, governmental agencies etc., that already have an infrastructure in place but lack the competitive edge that the ability to understand and make sense of large text streams in multiple languages gives. Ethersource is an implementation of the Semantic Processing Layer of the Big Data Stack, and we intend to move it as such.

“The way we share and what kind of content we share will evolve”

The more content people publish on social media sites and blogs, the more important it is for companies, brands and organisations to monitor what’s being said about them on the web. There is a huge number of social media monitoring services to choose from. Many are using Twingly data about the blogosphere, such as Sweden-based Lissly. We had a chat with Simon Sundén, one of Lissly’s co-founders, about what’s happening at Lissly, what to expect in the coming months and where he thinks social media is heading.

Please give us a quick introduction of Lissly. What’s the company background and what kind of services are you offering?
Lissly is a social media monitoring tool which you can use to monitor what’s being said on social media about any keyword or phrase. We launched our tool in October 2010 and are based in Sweden. Lissly focuses on providing the best monitoring for local markets and languages, which often isn’t that easy with other tools and services. We worked hard to have the best data for Sweden and now we are expanding to other countries & languages in Northern Europe.

What are the main points of differentiation between Lissly and other social media monitoring solutions?
We know the local market and offer monitoring for local languages, especially in Northern Europe. Lissly is also very easy to use; we like to call it “Social Media Analytics for the People”. But of course you can always go in-depth and get detailed data.

Is there any feature that in your eyes is especially good or useful, that you want to highlight?
Of course everything in Lissly is awesome, but our Forum Monitoring as well as Related Words are some key features I personally like a lot! Currently we monitor a majority of all forum activity in Sweden, including the largest forums in Sweden overall as well as within each niche. Related Words is a feature where you can directly see what related words & topics are connected to your keyword or project.

What is on the roadmap for the upcoming 12 months? Where is Lissly heading?
We strive to have the best quality in every single language in the Nordic region (Swedish, Danish, Norwegian, Finnish) and plan to expand to other markets & languages. That also means that we will add a lot of new sources. Every language and country has its own important blogs, forums, social networks – we will allow monitoring of all of them. Other upcoming improvements include an iPhone app that we plan to release in autumn, enhancements to our API, features to show more information regarding each mention (retweets, shares, likes, views, ratings etc.) to better understand the social impact, and better functions for bookmarks, notifications and mail reports.

You are using Twingly’s API to collect data from the blogosphere, so you have a rather good insight into the world of blogs ; ) What are your thoughts on the evolution and future of blogs?
Yes, Twingly’s API is one of the sources we use to gather blogosphere data and we really like it. Concerning blogs: They have “survived” many years and I’m absolutely positive that they will continue to be an important part of the social web in the future. What will evolve is the way we share and what kind of information we share – with better mobile connectivity and easier services like Tumblr we will see a lot more of picture, video and other media type sharing than plain text. Much of the blogging today doesn’t happen on what we typically call a “blog platform” like WordPress, Typepad or Blogger but rather on video sites, sharing sites etc. We see a lot of video blogs on YouTube, picture blogging, sharing on Tumblr and so on – this is also blogging and I think that this will increase in the future.

Where do you see social media in 2-3 years?
In 2-3 years we will not talk about social media anymore but rather the social web. It’s already becoming harder and harder to find sites on the web that aren’t social. I have a feeling that we are moving towards a web where we increasingly will be dependent on our social identity. This will be the basic platform where all our social activities are tied together – you will use it to comment on sites, register for forums, play games and so on. We already see this today with services like Facebook and Google, but as more sites implement social functionality the amount of information connected to our social identity will grow.

Media Monitoring Companies Using Twingly (Part 2 of 2)

Last week we started to present social media monitoring services and research companies that use data about the global blogosphere collected by Twingly. Today we continue with this overview. If you haven’t seen the first part of the list (where we also explain the two APIs that we offer to our partners), you can find it here.

Radian6
Radian6 is one of the most popular and best known social media monitoring services in the world – and a client of Twingly, accessing our blog data for integration into their monitoring tools. The Canada-based company was founded in 2006, focuses on businesses and provides them with tools to listen, measure and engage in conversations across the social web. Radian6 has over 1,700 clients worldwide.




Notified
Notified is a Swedish service for social media monitoring that aims to provide its clients with a very intuitive interface and tries to simplify analytics and statistics to make them as clear and easy to understand as possible.

Read more about Notified (in Swedish)

Retriever
Retriever is owned by the Swedish news agency TT and offers different kinds of media analysis, monitoring and research services. With “Pulse” the company has its own social media monitoring offering. Since 2009 they have been working with us to get Twingly data about the global blogosphere.

Read more about Retriever (in Swedish)

Infopaq
Like Retriever, Infopaq also has a broad focus on all kinds of media monitoring, news evaluation and analysis. One part of their service includes monitoring of what’s being said on the social web, inventory analysis and also campaign analysis. The company has about 6,500 clients and 500 employees across the Nordic countries, Estonia and Germany.

Read more about Infopaq (in Swedish)

Imente
Even though Twingly is based in Sweden our data covers the global blogosphere. Imente is a Spanish provider of media analytics and monitoring tools that connects to our API to use Twingly data for its social media monitoring services.

Media Monitoring Companies Using Twingly (Part 1 of 2)

Here at Twingly we aggregate a lot of data from the blogosphere since we are crawling blogs worldwide for our blog search engine. Apart from letting users find and discover content from blogs, we are working together with a number of social media monitoring and research services that are using Twingly data for their offerings.

Before we have a look at who these partners are, here is a brief description of the two APIs (Application Programming Interface – the way that external sites connect to Twingly data) that we are providing to our data partners:

Analytics API: The Analytics API is based on our blog search engine, comes with a visual search interface and allows for accessing blog content published during the past 4 months.

Livefeed API: This API gives partners access to all raw data our crawlers collect from the blogosphere, separated by language, as an XML feed and without any delay. The Livefeed API is more extensive than the Analytics API. Partners choose one of the two APIs depending on their specific data needs.

Every company mentioned below is using one of those two APIs. If you are interested in becoming a Twingly data partner, we would be glad to hear from you. And if, after reading this posting, you have become curious about the data we are collecting from the blogosphere, head to our search engine, try it yourself and maybe start using it for your own personal social media monitoring (here we explained how to do that).

Meltwater Buzz
Meltwater is a global player within the field of news and social media monitoring, serving more than 18,000 clients in 25 countries. With the “Buzz” product Meltwater offers companies and organisations tools to monitor and analyze what’s being said on social networks, microblogging services, video and photo platforms, forums, blogs and other sites based around the concept of user-generated content.

Read more about Meltwater Buzz (in Swedish)

Silobreaker
Silobreaker was founded in 2005 and is headquartered in London but has its development team in Stockholm. Apart from a free news search, Silobreaker offers media monitoring based on statistical and semantic analysis to corporate, financial, NGO and government agency users, and monitors content from old, new and social media.

Read more about Silobreaker (in Swedish)

FindAgent
London-based FindAgent provides its social media monitoring services to companies and brands who want to succeed with their digital marketing initiatives. One of FindAgent’s focus areas is blog monitoring, both in Sweden and on a global scale. The company has developed a technology which tries to understand the meaning of the content monitored and lets its customers ask questions that are answered automatically.

Read more about FindAgent (in Swedish)

Nobicon
Nobicon is a Swedish company specialized in media monitoring, providing organisations with extensive data on what clients, competitors, investors and other stakeholders are doing, saying and thinking. Nobicon’s monitoring tools can be integrated into the client’s intranet, website or ERP system.

Read more about Nobicon (in Swedish)

We will continue next week with part 2 of the list of Twingly data partners!