Weblog
PLOG Thursday RESTapi current status
Timo and friends talk about the current status of the REST api at Plone Open Garden Sorrento. Plus a bit of general javascript talk.
RESTapi current status
Timo started with some proof of concept implementations. See https://github.com/plone/plone.restapi
If it would not work with for example the ZPublisher, then that would be bad, so we should look into that. Let it support http verbs, like POST, GET, PUT, DELETE, instead of assuming it is a webdav request when it is not POST or GET.
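A minimal sketch of what dispatching on the HTTP verb could look like, to make the idea concrete. This is not ZPublisher or plone.restapi code: the `dispatch` function, the handler names and the in-memory `store` are all made up for illustration.

```python
# Hypothetical sketch: route a request by its HTTP verb instead of treating
# everything that is not GET or POST as a WebDAV request.

store = {}  # in-memory stand-in for content, keyed by path


def do_get(path, body):
    if path not in store:
        return 404, None
    return 200, store[path]


def do_put(path, body):
    # 201 Created for a new resource, 200 OK when replacing an existing one.
    status = 200 if path in store else 201
    store[path] = body
    return status, body


def do_post(path, body):
    # Create a child with a generated id below the given container path.
    child = "%s/item-%d" % (path.rstrip("/"), len(store) + 1)
    store[child] = body
    return 201, {"location": child}


def do_delete(path, body):
    if path not in store:
        return 404, None
    del store[path]
    return 204, None


HANDLERS = {"GET": do_get, "PUT": do_put, "POST": do_post, "DELETE": do_delete}


def dispatch(method, path, body=None):
    handler = HANDLERS.get(method.upper())
    if handler is None:
        return 405, None  # Method Not Allowed
    return handler(path, body)
```

Unknown verbs such as PROPFIND get a 405 here. In a real setup that is the spot where you would hand over to WebDAV handling instead.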
Aren't people moving away from that, just using GET parameters? Staying close to REST seems best, Angular and other frameworks can handle it. Workflow actions will use POST anyway.
You will always transform stuff between the saved data and presented data, like saving a uuid and presenting a normal url. You save something and then may get back an object with different values.
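To illustrate the uuid versus url example: a small sketch of transforming between stored and presented data. The field names, the site url and the lookup table are assumptions made up for this example. A real site would ask the catalog.

```python
# Hypothetical sketch: the stored value is a UUID reference, the presented
# value is a normal URL. All names and data here are made up.

SITE_URL = "http://localhost:8080/Plone"

# Stand-in for a catalog lookup from UUID to path.
UUID_TO_PATH = {"a1b2c3": "/news/plog-2015"}


def to_presented(stored):
    """Return a copy of the stored data with the UUID reference resolved."""
    presented = dict(stored)
    uid = presented.pop("related_uuid", None)
    if uid is not None:
        presented["related_url"] = SITE_URL + UUID_TO_PATH[uid]
    return presented


def to_stored(presented):
    """Inverse of to_presented: turn the URL back into a UUID reference."""
    stored = dict(presented)
    url = stored.pop("related_url", None)
    if url is not None:
        path = url[len(SITE_URL):]
        path_to_uuid = {p: u for u, p in UUID_TO_PATH.items()}
        stored["related_uuid"] = path_to_uuid[path]
    return stored
```

So a client that saves `related_url` gets back an object that internally holds `related_uuid`, which is the "different values" effect mentioned above.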
Several levels of RESTfulness.
- Resources
- RPC calls
- HTTP verbs
- hypermedia
If we only go for the second level, we could just use the json api. We should play around with the third level, to see if we can make it work.
There is a risk that we break webdav when we fix the ZPublisher. We may have to do that. Webdav works according to some, is buggy for others, or not working at all. For webdav you could look at Accept headers, or discover webdav in some way like that.
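The Accept header idea could be sketched roughly like this. The function name is made up, and the parsing is deliberately naive: it ignores q-values and wildcards, which a real implementation would have to handle.

```python
# Hypothetical sketch: decide from the Accept header whether a request asks
# for JSON, so other requests can fall through to the existing handling.

def wants_json(accept_header):
    """Return True when the Accept header explicitly lists application/json."""
    for part in accept_header.split(","):
        media_type = part.split(";")[0].strip().lower()
        if media_type == "application/json":
            return True
    return False
```

A browser sending `text/html,application/xhtml+xml,*/*;q=0.8` would not be treated as a JSON client here, which is the conservative choice.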
Take a look at the dexterity transmogrify code and see if we can take some export code from that. Also https://github.com/collective/plone.jsonapi.core. And look at json schema.
We thought about authentication, but the first phase is just about reading. In a web browser the current authentication will work fine. For non browser visits we need something else, but that can be done later.
The edit schema may differ from the add schema or the view schema. David Glick has written code in javascript for creating a form based on such a schema, using ReactJS and ReactForms.
So we may not need z3c.form then. But z3c.form also does data transformation and validation, you would still need that. If your schema is defined in json, you could use some json schema handling and validation in the backend as well. That is long term.
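A rough idea of what backend validation against a schema defined in json might look like. This toy validator only understands a tiny subset of JSON Schema (`type`, `required`, `properties`) and is written with the standard library only. In practice you would more likely use an existing JSON Schema library. The schema itself is made up.

```python
import json

# Hypothetical edit schema, as it might arrive from a json file.
EDIT_SCHEMA = json.loads("""
{
  "type": "object",
  "required": ["title"],
  "properties": {
    "title": {"type": "string"},
    "rating": {"type": "integer"}
  }
}
""")

TYPES = {"string": str, "integer": int, "boolean": bool, "object": dict}


def validate(data, schema):
    """Return a list of error messages; an empty list means valid."""
    expected = TYPES[schema["type"]]
    # bool is a subclass of int in Python, so reject it for "integer".
    wrong_bool = expected is int and isinstance(data, bool)
    if not isinstance(data, expected) or wrong_bool:
        return ["expected %s" % schema["type"]]
    errors = []
    if schema["type"] == "object":
        for name in schema.get("required", []):
            if name not in data:
                errors.append("missing required field: %s" % name)
        for name, subschema in schema.get("properties", {}).items():
            if name in data:
                for error in validate(data[name], subschema):
                    errors.append("%s: %s" % (name, error))
    return errors
```

The same schema could then drive both the javascript form and this check, which is the long term idea.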
If you GET a page, you want a json with all the data you might want to see there, so title and fields of this object, list of items if it is a folder, portlets.
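As an illustration of such a response, here is a sketch that builds the json for a folder. The key names (`url`, `fields`, `items`, `portlets`) are assumptions for this example, not the format plone.restapi uses.

```python
import json

# Hypothetical sketch of a content-centric JSON representation.


def serialize(obj):
    """Build a JSON-ready dict for a content object given as a plain dict."""
    data = {
        "url": obj["url"],
        "title": obj["title"],
        "fields": dict(obj.get("fields", {})),
        "portlets": list(obj.get("portlets", [])),
    }
    # Only folderish objects get a listing of their items.
    if "children" in obj:
        data["items"] = [
            {"url": child["url"], "title": child["title"]}
            for child in obj["children"]
        ]
    return data


folder = {
    "url": "/news",
    "title": "News",
    "fields": {"description": "Site news"},
    "portlets": ["navigation"],
    "children": [{"url": "/news/plog", "title": "PLOG report"}],
}

body = json.dumps(serialize(folder), indent=2, sort_keys=True)
```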
Timo: I have limited time to work on this. I do have a project where I am using it. Good if it can work on Plone 4.3 for this. But if it would only work on Plone 5 that would not be a deal breaker.
Hypermedia is there: you can click through the site with json. The json exposes other urls that you could click.
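Clicking through the site with json could look like the sketch below. The responses are canned and the key names are made up. A real client would do an HTTP GET in `fetch` instead of a dictionary lookup.

```python
# Hypothetical sketch: follow the urls exposed in each JSON document.

RESPONSES = {
    "/": {"title": "Site", "items": [{"url": "/news"}, {"url": "/about"}]},
    "/news": {"title": "News", "items": [{"url": "/news/plog"}]},
    "/news/plog": {"title": "PLOG report"},
    "/about": {"title": "About"},
}


def fetch(url):
    # Stand-in for an HTTP GET that returns parsed JSON.
    return RESPONSES[url]


def crawl(start):
    """Visit every document reachable from start, breadth first."""
    seen = set()
    queue = [start]
    titles = []
    while queue:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        document = fetch(url)
        titles.append(document["title"])
        queue.extend(item["url"] for item in document.get("items", []))
    return titles
```

The client only needs to know the start url. Everything else is discovered from the responses.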
There is a live demo linked on the github page: https://github.com/plone/plone.restapi. You can install a Mozilla json plugin to look at it.
If companies would be willing to give developers money or time for this, that could be helpful. Maybe there is appetite to pool resources. The API design needs to be done before we can ask someone to really dive in and commit. It could feel strange that one person gets paid and others work on it for free, although I wouldn't mind, for me it is a lack of time.
Javascript front-end
Good to get some options out there, get understanding with more people about what we are actually talking about, so if we make a decision it is more informed, and knowingly agreed upon by more people. What are limitations of Angular, React, Patternslib, etcetera? What do we expect from a javascript front-end?
Plone Intranet is using Patternslib and it will be around in 2020.
People will build multiple javascript front-ends for Plone, with whatever framework they like.
Can we come up with a matrix of several frameworks in the next session?
[Well, we tried, but your note taker gave up.]
PLOG Thursday morning talks
Talks at the Plone Open Garden at Sorrento.
Report-out from yesterday afternoon sessions:
- ZCatalog: ripping it out? Set up a performance test first, to get a baseline understanding of where the issues are. Make ZCatalog more pluggable. Enhance visibility of collective.solr in our documentation, good for larger deployments, improve its documentation.
- New plone.org work: halfway finished, most content that we want to keep can be transmogrified, css and responsive tweaks needed, personal profile pages with metrics to show what you are doing as member of the Plone community. PloneSoftwareCenter for the add-ons will be lost. Perfection is the enemy, it would never be finished then. Use collective.roster and collective.workspace. Plone 4. Started a working group to manage this and see it finished: Víctor, Gil, Christina, Fulvio and me, meeting every two weeks. Some of you may be called upon to help. Give us a shout if for example you know about LDAP.
- Marketing/positioning: listed some competitors, talking about sectors, different ways we use Plone. Plone is obviously a very good CMS, but also a platform, a set of tools, that can be used to serve various targeted needs. Unique set of features. Discussion will continue this afternoon. [Fred will present some ideas / pep talk this morning.]
Víctor Fernández de Alba: Activity stream and conversation engine (MAX)
Víctor, Carles, Ramon. We have built a WhatsApp like application on Pyramid. Website. Push notification to app on phone.
First commit in August 2011. Initially designed as key feature for BarcelonaTech university concept of social intranet. But limited resources for this. We did it bit by bit.
Activity stream with the basic key concepts of activity, comments, likes, favorites, upload images and files.
Conversations: one on one, in group, also with images and files. Doing this realtime using STOMP RabbitMQ plugin.
Push notifications to iOS and Android, including apps for those two.
Aggregate external sources: Twitter.
Authentication is done with OAuth 2.0. Resource Owner Password Credentials workflow using a server we created: osiris. See https://pypi.python.org/pypi/osiris
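For reference, the token request of the Resource Owner Password Credentials grant, as described in the OAuth 2.0 specification, can be built like this. The function name is made up, and whether osiris expects extra parameters or which url it listens on was not part of the talk, so nothing is sent here.

```python
from urllib.parse import parse_qs, urlencode

# Sketch of the password grant request body from the OAuth 2.0 spec.
# Only builds the request; the token endpoint of osiris is not assumed.


def build_token_request(username, password, scope=None):
    """Return (headers, body) for a password grant token request."""
    params = {
        "grant_type": "password",
        "username": username,
        "password": password,
    }
    if scope:
        params["scope"] = scope
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return headers, urlencode(params)
```

The server answers such a request with an access token, which the client then sends along with later API calls.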
Activity stream stores activity from users and applications. Application can 'impersonate' a user to feed the stream with useful information, for example when uploading a file somewhere. MongoDB for storing info.
Subscriptions are made against contexts, something with a unique URL. Everything aggregated on the timeline of the user.
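The subscription model can be sketched in a few lines. This is an in-memory toy with made-up names, not the MAX implementation, which stores its data in MongoDB.

```python
from collections import defaultdict

# Hypothetical sketch: users subscribe to contexts (unique urls) and their
# timeline aggregates the activity of all contexts they subscribed to.


class ActivityStream:
    def __init__(self):
        self.activities = []  # dicts in posting order
        self.subscriptions = defaultdict(set)  # user -> set of context urls

    def subscribe(self, user, context_url):
        self.subscriptions[user].add(context_url)

    def post(self, context_url, actor, text):
        # An application may 'impersonate' a user by passing it as actor.
        self.activities.append(
            {"context": context_url, "actor": actor, "text": text}
        )

    def timeline(self, user):
        """Return the activity of subscribed contexts, newest first."""
        contexts = self.subscriptions[user]
        matching = [a for a in self.activities if a["context"] in contexts]
        return list(reversed(matching))
```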
Realtime conversations and private messaging, going to RabbitMQ, then to conversation queues, then via an API (WSGI client) to MongoDB. Support for sending images and files too.
Infrastructure. Front-ends: MAXUI.js on iOS and Android. OAuth server. API. MongoDB plus RabbitMQ. Queues and consumers designed for huge loads.
API has 88 RESTful endpoints. Over 600 tests. Powered by Pyramid, optimized for gevent.
Performance: 4000 concurrent users: 100 messages per second.
LDAP integration. Deploying it to Moodle.
What's next: follow people, contextless activity streams. Documentation not only in Catalan.
Resources:
Please give feedback. Maybe you want to use it? If we get enough interest, we can convince our managers to let us go to EuroPython. ;-) We would like to get traction to continue to develop this.
Alexander Pilz: Design First Driven Process
The Plone Intranet Consortium is using this. Let's explain how we look at it, how we define it. Give some insight.
Why is design first necessary?
Part 1: the backend
The developer's fun is the integrator's nightmare.
New UI with Plone as backend: you end up with two separate projects: a backend project (Python) and a frontend project (css and javascript).
The Traveling Integrator Problem. Plone alone works fine. Each add-on you add increases the complexity. The more add-ons I install, the more I need to redo the user interface: one adds an action in a viewlet, another in a portlet, another in a content menu dropdown, so the end user does not know how to use them all.
People do not want to work with an intranet, they have to.
Design is not something on top of add-ons and Plone. It is something that should be integrated in the whole. For that to work, you need to start with the design.
In the Intranet Consortium we started with six companies who were using eight workspace implementations.
Part 2
- Platform approach: an add-on that is as flexible as possible, the least assumptions.
- Product approach.
- Developers approach: first build it, then make it nice.
- Designers approach: make it nice, then develop for it. Emphasis put on the end user.
"Design First" process. Product owner and designer (and only optionally a developer) sit and detail the requirements. You need a designer who can do html and css. A patterns library so that designers can add UI behavior, interaction design, even though nothing yet happens on the backend. You directly use the actual design as the diazo theme.
It is hard to keep developers from designing. They actually like doing that a bit. You have to hit them on the fingers: don't make design decisions yourself, talk with the designer.
Visual versus interaction design. Designers are not just there to make things nice. Theme is the visual design. But there is more: interaction. Usability, user testing. Theming is okay for websites. For intranets there is much less need for branding: you do not need to be reminded for which company you work.
Fred: Branding
Framing effect: what you see, influences what you think later.
I am not a marketeer or communication specialist (anymore), so I shouldn't give this talk. This is just an intro, not a lecture.
You can look at sectors, audiences, regional, etcetera, but it gets complicated. Maybe we have to agree to disagree. Also the wording, top down: marketing, strategy, branding, positioning, campaigning, advertising, communicating, copy writing. Look up a bit. Get out of our comfort zone.
Brands give you emotions and create an identity. They frame you. You see someone and you have an immediate idea: is this person nice or not.
Selling cars. How? They can try to sell horsepower, color, style. Or safety, speed. But deeper down: freedom. Cultural differences. BMW: fun, experience. Audi: technology. Those messages are usually not there explicitly, but it sifts through. Foundation on which they put their other marketing and communication expressions. You don't have to be vocal about your brand identity. Should be consistent, not contradicting. If Paul suddenly shows up in a business suit, I would ask: where is Paul.
This is not about features. Does Paul have portlets? Do I have tiles?
You cannot pick an identity and be done. Time frame of three to five years where you cultivate it. Maybe we already have an identity.
How to describe it. Mission statement, About Us section, mostly describing what you are actually not in the case of corporations. Slogans, persona. Archetypes. No, not the next version of dexterity, but Jungian Archetypes. Twelve in total, like outlaw, sage, magician, lover, hero. An abstraction. Independent of culture: hero evokes the same emotion in America as in Asia.
Example: a sage. Finding truth, wisdom. Weakness: study details forever. Think of Gandalf, Yoda, Dumbledore. BBC: if someone hits a co-worker, you fire him, because it damages your image.
There is more documentation on Jungian Archetypes than you can find on Plone.
This may be touchy, feely, fuzzy stuff. We are tough, rational developers. So we might need other specialists that actually see this as an exact science.
Back to Plone. Problems have since 2004 been documentation and marketing.
Global positioning or global marketing is not bad, but we are too diverse. If you choose one niche to target, you lose another one.
Try to find a more generic but globally recognizable Brand Identity. Seek help. Derive a brand strategy from this identity. Sharpen the identity that we already have. Focus. Gravitational areas. Schwerpunkte in German. Even distributions can have their own sub branding with aligned personalities. A feeling of what is important in that sector. Look at a competitor, how are they positioning themselves, and can you take the opposite side?
Look at the Plone logo. Close your eyes and be quiet for ten seconds. What do you feel? Probably not much. We have not attached a brand to the logo yet.
A new plone.com and plone.org is nice, but we also need a brand.
Danger of ping-ponging between 'we need to focus on this' and 'no, on that' or 'we do not know how to do this and have no time and money'.
Edward de Bono writes about group dynamics. Imagine six hats with different colors. You can say in a meeting: 'put on your black hat and look at the bad parts' and later 'put on your yellow hat and look at the nice parts', or 'blue hat to look at process'. If six colored hats are in the same discussion at the same time, you will just go back and forth and not end up anywhere, just go around in circles. This is true for marketing discussions and for technical discussions about our backend or frontend. See http://en.wikipedia.org/wiki/Six_Thinking_Hats
'Plone gives you peace of mind' may sound good to Europeans. Americans may like more 'Plone is stable 24/7.'
Who are you? Mira Lobe / Susi Weigel: I am me. Read it here, nice story: http://www.jungbrunnen.co.at/data/medialibrary/2013/04/Ich_bin_ich_englisch.pdf
JC Brand: Patternslib, Mockup
Patterns make it possible to first create an interaction design, without creating backend code first. Cornelis Kolbach creates designs for sites that still look and work fine without javascript. The patterns are there so it works more nicely or faster when you have javascript enabled, but it is not in the way. It allows designers to create an interactive design without needing to write their own javascript.
I have replaced mockup stuff with patterns stuff. Mockup is now a collection of Plone-specific patterns, for example for the query string widget used in collections. Patternslib has the non Plone-specific stuff. There is some duplication that we want to resolve. That is the direction in which we are going.
Patternslib is very lightweight. Not a 'kitchen sink' framework. Trying to keep it as vanilla, plain javascript as possible. Lots of third party javascript libraries could be easily integrated. AngularJS maybe not that easily.
Isn't it yet another too Plone specific thing? Patternslib was developed outside of Plone, by Cornelis and Wichert. It has zero dependency on anything in Plone. It is simple conceptually. There is not that much to maintain.
You should usually reuse patterns, instead of creating your own. You may need a new one from time to time, but if you have that often, something is wrong.
Timo: not to start a pillow fight, but Angular does much the same thing. But where do you see Mockup in three or four years? In combination with a json api?
JC: We have patterns in Plone now and might as well use them. We need to inform people about how it works, because you sometimes see a pattern where you think: this misses the point. If we use it, we might as well use it in a sane way. Patternslib is not a competition for Angular. We do not want complicated javascript in Plone, which is first and foremost a Python framework. No closely tied javascript client to the Python server.
Alex: Patterns is a simple solution that works at the moment. And it puts us more in the correct javascript mindset. That mindset is more important than the exact technology. Plone as framework should be ready for a pure javascript front-end, whatever it is.
Timo: Three parts that we should do. Cleanup backend, write RESTful api, get some javascript frontend.
Paul: Keep in mind how these things would work for add-on writers. Do they need to learn alien technology or rewrite their add-ons every three years? Also, Mockup documentation should somehow be moved to docs.plone.org.
Timo: Patterns are there in Plone 5. First thing is RESTful api, but that does not change anything yet.
Roel: On the Four Digits Anniversary Sprint we will have an API track, good to have people there who want to work on it.
PLOG: new plone.org work
Víctor Fernández de Alba talks about the work on the new plone.org site that is in the works.
It is on new.plone.org, but you need to login immediately. Your plone.org account does not work yet.
Does it still make sense to do it in Plone 4 instead of Plone 5? Eric: let's get it out there soon. So Plone 4.
I have been looking a lot at the theme. I am now bored by it. Do you feel the same? Should we create a more modern theme? Still lots of columns. Moving banner / carousel. Seems still good.
Paul: It should say: the community is great. Community news, sprints, big button for the newsletter.
Display some technical metrics, like number of committers, pulling info from github, twitter, etc. Refresh them nightly.
Philosophy behind it? Community site, for contributors and users. Profile page for users. Reach out to existing and new community members.
Look at mozilla.org. Tell people how to contribute.
There needs to be a spot for say the Plone Foundation, where the boring bits like meeting notes get stored. So kind of a traditional site, not brochureware.
On top a bar, same as for the national Plone sites.
Dropdown menu: maybe not.
Get across that people are constantly working on it. Some activity stream?
What are main issues that still need to be done?
- Profile pages, badges, from collective.roster, integrate it. collective.workspace (for teams) can help too. Badges for being on a team.
- Design those badges.
- Big photos planned for profile page. Aren't developers too shy for that? I expect bad photo quality. People will submit photos of kangaroos. With small photo you get a lot of empty space. We may need a redesign of this page.
- Theme related issues.
- Finish contributor and community member page.
- Get it out. When it is still a future project, people will be less inclined to participate in remaining tasks. Finish critical stuff, worry about the rest later.
- PLIPs: we will do that on github.
- LDAP. Steve is our resident LDAP guru.
There is a ploneorg.core issue tracker: https://github.com/plone/ploneorg.core
Some content will be lost. This was decided earlier. Of course the product pages from PloneSoftwareCenter. We need to keep a copy of other pages somewhere, like the old World Plone Day pages. See if someone complains. We did not want to automate migrating the old mess.
All kinds of forms, for example for ordering marketing material, could be added. Currently, just mail the Plone board, they handle it. So not needed for now.
Profile is intended to be a canonical, non-judgemental list of things you have done for Plone. Might need input from Plone ambassadors, as not everyone will for example answer questions on stackoverflow in English, but they may do cool things locally.
Team: Víctor, Paul, Gil, Christina, Fulvio.
PLOG Wednesday noon: backend
Our current backend stack. Plone Open Garden.
Timo Stollenwerk leads discussion about status of our current backend stack.
Gather a full report, what happens to the components we are using, put it together. Start discussion about what we can do about it.
[Note that this is not necessarily the vision of the Plone community. This is a gathering of thoughts of individuals present at the meeting. Not everyone will agree.]
- Python 2 and 3. Python 3 is pretty far away with our current stack. Several packages work on Python 3, even some of the 'scary' parts. We would need to solve other problems first.
- Zope 2 currently has no maintainer. Oh, Godefroid is release manager. People are still contributing, but not everything gets released. There are automated builds for testing. If we need something from Zope 2, we can talk to people, but we need to develop it ourselves. We use the latest Zope 2 release.
- Zope toolkit. On github. Version wise we are behind, which is not actually that bad. It is quite stable. Version updates seem to have mostly been for some reorganisation, and also getting rid of zope.app stuff. Various people still have commit rights, people do stuff when they need it. Tres Seaver and Marius Gedminas change stuff, avoiding bitrot and keeping tests running. No big plan. The Zope Foundation organisation is not really active.
- ZODB 3. We could move to 4. A Python 2 ZODB cannot be migrated to Python 3. But the package supports both.
- ZCatalog. David Glick and Hanno Schlichting worked on it. It needs Zope 2, so no one outside is using it. Technology is way beyond its limits, holding catalog data in zodb is over. ZCatalog is what makes ZODB unmarketable, becoming a bottleneck.
- CMF. Still on subversion on zope.org, which makes our continuous integration a bit unstable. Basically only one committer, who prefers not to switch to github. There are some ways to automatically sync subversion to github. We can talk to Tres Seaver as well. In CMF everything depends on everything, hard to separate. We are subclassing much of CMF. Essential is at least the portal_types tool and workflow. We are not actually using the latest release I think, which might remove the CMFDefault dependency, not sure. Custom code out there will reference core CMF stuff that is not really in the API, like underscore methods. We can at least try to rip out stuff and see what happens. Flatten the inheritance structure. Create an explorative team that takes a few hours to try things out and come up with an estimate.
- Archetypes and dexterity content types. Still supporting Archetypes, kind of, mostly some UI issues. Testing against both Archetypes and dexterity content types is tricky, brittle. If people use it and depend on it, it will be maintained. Suggest deprecating it in Plone 5.1, rip it out in 6? If people put effort in it, it is fine. If it is moved out of the core, it is dying. There is migration code for data. Plone community is smart, but we cannot handle all parts forever, we might need to drop stuff that is hardly getting used. If we want to rip it out in Plone 6, we should announce that officially. We need a roadmap for that. And find somebody who cares about Archetypes enough to keep an eye on it, that it remains working nicely. Please step up if you care, maybe we could still keep it in Plone then. Making it more smooth to move content types from Archetypes to dexterity, also in the actual Python code instead of just the data, would be very helpful. Make a script that can take an Archetypes schema and creates a dexterity schema from it, as a start. We are good at creating ideas, but bad at communicating what people would need to do, that they should move out of Archetypes. Publish it on plone.org, etcetera. Make a decision and communicate.
- Zope 4. Many options for the next step that we could take. It matters a lot where you want to go. ZODB and object model make Plone unique, flexible, very good for content management. So on long term: keep ZODB, do not move to relational database (well, RelStorage is fine). Storing json in postgres could be an option too. Writing object oriented, polymorphic code is sane with ZODB, including hierarchical data. Security on object level is awesome, not for only View permission, but other permissions, which other frameworks don't do. RestrictedPython that checks whether you have the right permissions is very nice. Some don't agree. Role based access control is great. Building products on top of Plone is very powerful. A lot of that is based on features in CMF, think of workflows, etc. But it would be good to get rid of all the layers on layers on layers. There is so much Plone. You currently need to be an expert developer to cope with this. Plone has been around for a long, long time, which is actually good. We have lots of knowledge, we need some new answers on which parts could and should really be removed, without irreparably throwing stuff away. Pyramid has the same concepts, making it very familiar, clean, without over designing things, with expressive APIs. It is not about whether we should use Pyramid, but about figuring out what the essence is that we should keep. And what small steps can be done now to make changes in the future easier. For Pyramid the right solution was to rip everything out and build the essence. This may not be the best approach for Plone. We need to stop building layers and layers and layers. We need a stronger core with the essence. The DemoStorage is really very helpful for testing, much harder to do right in relational databases. A cleanup is not very rewarding to those working on it, you need to know where you want to go, where you want to end up. Rip out stuff and move to the shiny stuff on the front-end. 
Think about how to cut away significant parts of the complexity. If Plone is reduced to a json store, should we want that, we can rip out much. If you know where you want to be in five years, it avoids the temptation to do intermediary stuff that you will rip out a little while later. Complex systems evolved from simple systems that already work. We cannot start from scratch. Plone has always been an innovator. Keep doing that. But also be smart to switch to the best of breed if another community has come up with something better.
- Move the discussion back to Zope 4. Several years ago Laurence did work, some lost during a move to github. It is a significant amount of work. Does it really get us where we want to be? Do we need to rebuild everything we currently have in Zope 4, or do we need to build something else, sidestepping the current code? If we have a RESTful API, getting a shiny front-end becomes easier. Makes mobile development much easier too. All web development frameworks face this problem. A full JSON api will not be there overnight. It will be hybrid. We need to keep working on our Javascript story. Pyramid has template renderers and JSON renderers, making for a clearer separation of data and presentation. We can just serve the JSON and have some more or less separate front-end show it. Plone is a CMS, not really a framework, so it differs a lot from Pyramid. Pyramid is more like Zope. In Pyramid you build your application from the inside out, with Pyramid just as a tool.
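The script suggested in the Archetypes item above, one that takes an Archetypes schema and creates a dexterity schema from it, could start out like the sketch below. It is only a starting point under stated assumptions: the input is a plain list of (name, Archetypes field class name, title) tuples rather than a real Archetypes schema object, the type mapping is incomplete, and widgets, validators and vocabularies are ignored. The output is dexterity model XML in the plone.supermodel namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical sketch: map Archetypes field class names to zope.schema
# field types and write them out as a dexterity (supermodel) XML model.

NS = "http://namespaces.plone.org/supermodel/schema"

FIELD_MAP = {
    "StringField": "zope.schema.TextLine",
    "TextField": "zope.schema.Text",
    "IntegerField": "zope.schema.Int",
    "BooleanField": "zope.schema.Bool",
    "DateTimeField": "zope.schema.Datetime",
}


def to_supermodel(fields):
    """Return (xml_string, skipped_names) for a list of field tuples."""
    model = ET.Element("model", xmlns=NS)
    schema = ET.SubElement(model, "schema")
    skipped = []
    for name, at_type, title in fields:
        dx_type = FIELD_MAP.get(at_type)
        if dx_type is None:
            # No known equivalent: report it instead of guessing.
            skipped.append(name)
            continue
        field = ET.SubElement(schema, "field", name=name, type=dx_type)
        ET.SubElement(field, "title").text = title
    return ET.tostring(model, encoding="unicode"), skipped
```

Fields without a mapping are returned in the skipped list, so a person can look at them by hand.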
So likely directions from here:
- Reduce CMF to one layer within Plone.
- RESTful JSON api. This has the possibility of making a lot of the code irrelevant, so making it easier to remove. Makes it easier to use a different layer below for publishing.
There is no roadmap in the Javascript world, it can move everywhere, so we cannot know exactly what we would need to support or what we can count on there.
Do not forget the Python API in plone.api. Add-ons and higher level Plone layers should use this API. Still hard to get rid of the underlying stuff.
What you want in a JSON API is not a one on one rebuild of the Python API. More content centric: a JSON representation of the content.
More important to focus on where we want to be than on how we solve it. A vision is needed as a start. And more important to have a vision than to have the correct vision. Otherwise you go in circles. Do not plan on too many things that might possibly go wrong. Be bold. Do not lose vision based on what your customers want, or on which customers you may lose.
PLOG Wednesday morning talks
Talks, not exactly lightning, at Plone Open Garden Sorrento.
Paul: We want to have a rough roadmap at the end of the week, with something actionable. Now we start with talks.
Alex Ghica: Bucharest conference
The next big Plone conference will be in October this year, in Bucharest, capital of Romania. Five star hotel in the city center. High tower. By flight it is connected to lots of cities. Currently still cheap, about 40 euros from Amsterdam for a return flight. Sponsors are welcome. In about a month the conference tickets will be available, early bird about 360 euro.
Gil Forcada: WPOD
WPOD is: World Plone Office Day. Last Friday of the month, do not work on customer projects, but work on Plone. We should have entry-level tasks ready. Talk about it, so that people know this is happening. In each city, meet at a company's office, have a good, productive, enjoyable time with each other. Be available as mentor, team leader. Open a sprint channel on irc.
First one: April 24th.
Philip: I pledge that my company will do this every month, at least half a day, maybe a day. I challenge other companies to do the same. And be welcoming and helpful for newbies.
Guido Stevens: Plone Intranet
We have a Plone Intranet Consortium. Would that approach be doable for Plone core? It takes a long preparation, the first steps were started six years ago. It takes willingness to invest time and money. It is a business opportunity, asking investment and needing payment after a while by happy clients.
Demo of Mercury development version of Plone Intranet. Activity stream. Important sticky notes that remain visible until you click them. @mention people. Uploading for example a word document will convert it to PDF. Work spaces. Raptor as rich text editor. Wrapper around the sharing tab: three sliders to set your work space policy. Using experimental.securityindexing for permission indexing speedup. We want to do the Venus release before the summer. Workflow state transitions as milestones, use it for case management, fully flexible. Faceted SOLR search. Image bank. Render previews in the search. Personal page with personal stream. Find people by expertise. Follow people.
So that is the product. What is interesting is the process. Nine companies participate currently. Thousand euro per month, one developer day per week. Working from six or seven countries, sprinting remotely. Difficult to get traction, or on-boarding new developers. You need a few months to become productive. If you want to sell these solutions, the investment pays back. It is not just an add-on that you add to your Plone Site with a couple of others. First approach is to install it, change logo and other theming, do LDAP integration, upgrade, support it, make money that way. Second approach is to do your own design, and really think that through for a specific client. You can develop apps for it too.
Running on Plone 5 alpha 3. Dexterity content types. No z3c.form. Patternslib with the inject pattern. We keep accessibility in mind, it also works without Javascript. (Remember that accessibility is not the same as no Javascript. But some countries, at least the Netherlands, require non-javascript for certain sites.) Storing objects in btrees for performance. Lots of development goes into front-end code. Mobile-first design: working nicely on phones and tablets.
We develop it in a consortium, but the code is GPL and is donated to the Plone Foundation. Participating as a business is a risk, a possibility, and a lot of fun and opportunity to learn. If you want to participate, then make your decision. It should be a sound business decision.
It is one repository on github, but composed of multiple Python packages.
See the documentation: http://docs.ploneintranet.org
Ramon, Aleix: Machine learning for Plone content
Machine learning for text. Automatically learn from data, then predict for new content.
First step: from text to vectors. Second step: from vector to model.
Clustering: we automatically cluster documents together, so you get a list of related items, without needing to select them manually. It will allow us to create an automatic sitemap based on the clusters or topics.
Classification: text and tags. Predicting which tags would be appropriate for a newly added document.
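To make the two steps a bit more concrete: a small sketch that turns text into TF-IDF vectors and then uses cosine similarity to find the most related document. The speakers use scikit-learn. This example deliberately sticks to the standard library, so it is a simplified stand-in and not their code, and all function names are made up.

```python
import math
from collections import Counter

# Sketch: step one, from text to vectors; step two, use the vectors to
# find related items. Plain TF-IDF with cosine similarity.


def tokenize(text):
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse vector (dict of term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in docs]
    total = len(docs)
    document_frequency = Counter()
    for tokens in tokenized:
        document_frequency.update(set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: count * math.log(total / document_frequency[term])
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_related(index, vectors):
    """Return the index of the document most similar to docs[index]."""
    scores = [
        (cosine(vectors[index], vector), other)
        for other, vector in enumerate(vectors)
        if other != index
    ]
    return max(scores)[1]
```

Clustering and tag prediction build on the same vectors, with a proper model on top instead of a single nearest neighbour.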
We would love to have a Plone Machine Learning API.
- getLearningText(content)
- getLearningTags(content)
- getLearningRelation(content)
Computing the models: using scikit-learn. If you recompute at each new document, it will be slow.
Multi resolution time series, for continuously changing data, like number of visitors. Numbers instead of text. Lossy storage: less data, only the interesting info. Related work: RRDtool.
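The lossy, multi resolution idea can be sketched with fixed-size buffers in the style of RRDtool: recent values are kept in full, older values only as averages. The class below is an illustration with made-up names and parameters, not the speakers' implementation.

```python
from collections import deque

# Hypothetical sketch: each level keeps a fixed number of values. Every
# `factor` values of one level are averaged into one value of the next.


class MultiResolutionSeries:
    def __init__(self, factor=4, capacity=8, levels=3):
        self.factor = factor
        # deque(maxlen=...) silently drops the oldest value when full.
        self.levels = [deque(maxlen=capacity) for _ in range(levels)]
        # Values waiting to be averaged into the next, coarser level.
        self.pending = [[] for _ in range(levels)]

    def add(self, value, level=0):
        self.levels[level].append(value)
        if level + 1 < len(self.levels):
            self.pending[level].append(value)
            if len(self.pending[level]) == self.factor:
                average = sum(self.pending[level]) / self.factor
                self.pending[level] = []
                self.add(average, level + 1)
```

Storage stays constant no matter how long the series runs, at the price of losing detail for older data.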
Ideas: generate taxonomy of tags.