Hey Developers,
With the release of InterSystems IRIS 2021.2 Preview and all-new LOAD DATA functionality, we'd like to put it to the test with the new DATASETS contest!
🏆 InterSystems Datasets Contest 🏆
Duration: December 27 - January 16, 2022
Total prizes: $9,450

Prizes
1. Experts Nomination - a specially selected jury will determine winners:
🥇 1st place - $4,000
🥈 2nd place - $2,000
🥉 3rd place - $1,000
🌟 4-10th places - $100
2. Community winners - applications that will receive the most votes in total:
🥇 1st place - $1,000
🥈 2nd place - $500
🥉 3rd place - $250
If several participants receive the same number of votes, they are all considered winners, and the prize money is shared among them.
Who can participate?
Any Developer Community member, except for InterSystems employees (ISC contractors allowed). Create an account!
👥 Developers can team up to create a collaborative application. Teams of 2 to 5 developers are allowed.
Do not forget to highlight your team members in the README of your application – DC user profiles.
Contest Period
🛠 December 27 - January 9: Application development and registration phase.
✅ January 10 - 16: Voting period.
Note: Developers can improve their apps throughout the entire registration and voting period.
The topic
One of the most discussed problems with our previous programming contests is the lack of datasets. Every time you have a project idea about a particular subject area or industry, you need a related dataset, and part of the work with the contest is to find/prepare/load the dataset.
That’s why we decided to have a dataset contest! Let’s bring several helpful datasets to the InterSystems Community!
What are we looking for?
Present a repository that will load a dataset into the InterSystems IRIS namespace.
Ideally this is done with a ZPM package: the data can be inside the package, or the package can have a method that loads the data from a URL into the IRIS instance. Either way, your project, once installed, should bring a class (or classes) together with the data, related to a particular topic, subject area, idea, or industry.
The project should suggest how to use data - SQL query, REST API, or both.
Visualization of the data is a plus. Both visualization and API (if any) can be delivered with another project, but it’s not mandatory.
We don’t limit how the data can be stored in the repository. E.g., this could be:
- An export of the global(s) (preferably in XML rather than GOF format)
- An SQL script that creates the data
- An ObjectScript (or Java, JavaScript, Python, etc.) program that generates data in IRIS
- Integration with external Data API
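As a rough illustration of the "SQL script" option, here is a minimal Python sketch that builds a CREATE TABLE statement and a LOAD DATA statement from the header row of a CSV file. The table name `dc_data_geo.country`, the file path, and the choice to type every column as VARCHAR(255) are assumptions made for the example, not contest requirements. If the file has a header row, LOAD DATA also needs to be told to skip it; check the LOAD DATA documentation for the exact option.

```python
import csv
import io


def build_load_statements(csv_text: str, table: str, csv_path: str) -> list[str]:
    """Return [CREATE TABLE ..., LOAD DATA ...] derived from the CSV header row."""
    header = next(csv.reader(io.StringIO(csv_text)))
    # Assumption: every column is VARCHAR(255); a real dataset would refine the types.
    columns = ",\n".join(f"  {name.strip()} VARCHAR(255)" for name in header)
    create = f"CREATE TABLE {table} (\n{columns}\n)"
    load = f"LOAD DATA FROM FILE '{csv_path}' INTO {table}"
    return [create, load]


# Hypothetical sample data and names, for illustration only.
sample = "name,capital,population\nFrance,Paris,67000000\n"
statements = build_load_statements(
    sample, "dc_data_geo.country", "/opt/irisapp/data/countries.csv"
)
for statement in statements:
    print(statement)
    print()
```

The schema name uses underscores because IRIS projects a class package such as dc.data.geo to the SQL schema dc_data_geo.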
Here are example projects that do this:
Requirements:
- Class naming convention. Start the class names with: dc.data.your_name.class. E.g. if there is a dataset on trading data the class names could be: dc.data.finance.transaction, dc.data.finance.instrument.
- The reference to the source of data. If you take the dataset from somewhere on the Internet and adapt it to InterSystems IRIS format, please provide the link to the source. If this is your own data, please provide its usage license.
- The ZPM package name should start with "dataset-", e.g., dataset-countries, dataset-titanic.
- And as usual, we’ll have technical bonuses for docker, demo, article, zpm, video, etc.
- Provide the license to a dataset.
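The two naming requirements above can be checked mechanically. The following is a small sketch, not an official contest tool; the function names are made up for the example, and the rule that a class name needs at least a dataset part and a class part after `dc.data.` is my reading of the dc.data.your_name.class pattern.

```python
CLASS_PREFIX = "dc.data."
PACKAGE_PREFIX = "dataset-"


def check_class_name(name: str) -> bool:
    """True if the name looks like dc.data.<your_name>.<class>."""
    if not name.startswith(CLASS_PREFIX):
        return False
    parts = name[len(CLASS_PREFIX):].split(".")
    # Assumption: at least <your_name> and <class> must follow, and neither may be empty.
    return len(parts) >= 2 and all(parts)


def check_package_name(name: str) -> bool:
    """True if the ZPM package name starts with 'dataset-' and has something after it."""
    return name.startswith(PACKAGE_PREFIX) and len(name) > len(PACKAGE_PREFIX)


for candidate in ["dc.data.finance.transaction", "finance.transaction"]:
    print(candidate, check_class_name(candidate))
for candidate in ["dataset-titanic", "titanic"]:
    print(candidate, check_package_name(candidate))
```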
👉 Common license types for datasets (the source)
Common licenses in order of most open to most restrictive:
PUBLIC DOMAIN MARK - PUBLIC DOMAIN
Dedicate your dataset to the public domain: This isn’t technically a license since you are relinquishing all your rights in your dataset by choosing to dedicate your dataset to the public domain. To donate your work to the public domain, you can select “public domain” from the license menu when creating your dataset.
OPEN DATA COMMONS PUBLIC DOMAIN DEDICATION AND LICENSE - PDDL
This license is one of the Open Data Commons licenses and is like a public domain dedication. It allows you, as a dataset owner, to use a license mechanism to surrender your rights in a dataset when you might not otherwise be able to dedicate your dataset to the public domain under applicable law.
CREATIVE COMMONS ATTRIBUTION 4.0 INTERNATIONAL - CC-BY
This license is one of the open Creative Commons licenses and allows users to share and adapt your dataset so long as they give credit to you.
COMMUNITY DATA LICENSE AGREEMENT – CDLA PERMISSIVE-2.0
This Community Data License Agreement is similar to permissive open source licenses such as the MIT license. It allows users to use, modify and adapt your dataset and the data within it, and to share it. The CDLA-Permissive-2.0 terms explicitly do not impose any obligations or restrictions on results obtained from users’ computational use of the data. The 2.0 version is significantly shorter and uses plain language to express the grant of permissions and requirements. The only obligation is to "make available the text of this agreement with the shared Data," including the disclaimer of warranties and liability.
OPEN DATA COMMONS ATTRIBUTION LICENSE - ODC-BY
This license is one of the Open Data Commons licenses and allows users to share and adapt your dataset so long as they give credit to you.
CREATIVE COMMONS ATTRIBUTION-SHAREALIKE 4.0 INTERNATIONAL - CC-BY-SA
This license is one of the open Creative Commons licenses and allows users to share and adapt your dataset so long as they give credit to you and distribute any additions, transformations or changes to your dataset under this license. We consider this license (a.k.a. a viral license) problematic since others may decide not to work with your CC-BY-SA licensed dataset if there is a risk that by doing so their work on your dataset will need to be shared under this license when they would rather use another license.
COMMUNITY DATA LICENSE AGREEMENT – CDLA-SHARING-1.0
This license is one of the Community Data License Agreement licenses and was designed to embody the principles of "copyleft" in a data license. It allows users to use, modify and adapt your dataset and the data within it, and to share the dataset and data with their changes so long as they do so under the CDLA-Sharing and give credit to you. The CDLA-Sharing terms explicitly do not impose any obligations or restrictions on results obtained from users’ computational use of the data.
OPEN DATA COMMONS OPEN DATABASE LICENSE - ODC-ODBL
This license is one of the Open Data Commons licenses and allows users to share and adapt your dataset so long as they give credit to you and distribute any additions, transformations or changes to your dataset under this license. We consider this license (a.k.a. a viral license) problematic since others may decide not to work with your ODC-ODbL licensed dataset if there is a risk that by doing so their work on your dataset will need to be shared under this license when they would rather use another license.
CREATIVE COMMONS ATTRIBUTION-NONCOMMERCIAL 4.0 INTERNATIONAL - CC BY-NC
This license is one of the more restrictive Creative Commons licenses. Users can share and adapt your dataset if they give credit to you and do not use your dataset for any commercial purposes.
CREATIVE COMMONS ATTRIBUTION-NODERIVATIVES 4.0 INTERNATIONAL - CC BY-ND
This license is one of the more restrictive Creative Commons licenses. Users can share your dataset if they give credit to you, but they cannot make any additions, transformations or changes to your dataset under this license.
CREATIVE COMMONS ATTRIBUTION-NONCOMMERCIAL-SHAREALIKE 4.0 INTERNATIONAL - CC BY-NC-SA
This license is one of the most restrictive Creative Commons licenses. Users can share your dataset only if they (1) give credit to you, (2) do not use your dataset for any commercial purposes, and (3) distribute any additions, transformations or changes to your dataset under this license. We consider this license a viral license since users will need to share their work on your dataset under this same license and any users of the adapted dataset would likewise need to share their work on the adapted dataset under this license and so on for any other changes to those modified datasets.
CREATIVE COMMONS ATTRIBUTION-NONCOMMERCIAL-NODERIVATIVES 4.0 INTERNATIONAL - CC BY-NC-ND
This license is one of the most restrictive Creative Commons licenses. Users can share only your unmodified dataset if they give credit to you and do not share it for commercial purposes. Users cannot make any additions, transformations or changes to your dataset under this license.
ADDITIONAL LICENSE COVERAGE OPTIONS
If a license is not listed in the data.world menu options, you may select Other and specify the details in the summary of your dataset.
NO LICENSE SPECIFIED
No one can use, share, distribute, re-post, add to, transform or change your dataset if you have not specified a license.
These descriptions are only summaries of these licenses. For the actual text of the licenses, which we strongly encourage you to read, click on the links provided.
Summary of common license types:
PUBLIC DOMAIN
The work has been dedicated to the public domain by waiving all rights to the work worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
ATTRIBUTION
You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
SHARE-ALIKE
If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
NON-COMMERCIAL
You may not use the material for commercial purposes.
DATABASE ONLY
License applies to the database only and not its contents or data.
NO DERIVATIVES
No Derivative Works. You may not alter, transform, or build upon this work.
All licenses that begin with CC-BY in the table above refer to version 4.0 of those licenses.
General Requirements:
- Accepted applications: apps that are new to Open Exchange, or existing ones with a significant improvement. Our team will review all applications before approving them for the contest.
- The application should work either on IRIS Community Edition or IRIS for Health Community Edition or IRIS Advanced Analytics Community Edition.
- The application should be Open Source and published on GitHub.
- The application's README file should be in English and contain the installation steps and a video demo and/or a description of how the application works.
Helpful resources
1. For beginners with InterSystems IRIS:
2. For beginners with ObjectScript Package Manager (ZPM):
3. How to submit your app to the contest:
4. And more:
Judgment
Voting rules will be announced soon. Stay tuned!
So!
We're waiting for YOUR project – join our coding marathon to win!
❗️ Please check out the Official Contest Terms here.❗️
It would be great to see a solution that made it easy to get data from open government systems using popular APIs such as Socrata and CKAN. For example, Cambridge, MA uses Socrata. And the US Government uses CKAN.
Another interesting area is ingesting Synthea data directly into IRIS!
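To make the Socrata and CKAN idea a bit more concrete, here is a hedged Python sketch that only builds the request URLs; it makes no network calls. It assumes the usual Socrata resource endpoint (`/resource/<dataset_id>.json` with a `$limit` parameter) and the CKAN Action API `package_search` endpoint. The hosts `data.cambridgema.gov` and `catalog.data.gov` and the dataset ID `abcd-1234` are placeholders chosen for illustration, so verify them against the portals before use.

```python
from urllib.parse import urlencode


def socrata_url(domain: str, dataset_id: str, limit: int = 1000) -> str:
    """Build a Socrata resource URL that returns up to `limit` rows as JSON."""
    # Keep "$" unescaped so the parameter reads as $limit.
    query = urlencode({"$limit": limit}, safe="$")
    return f"https://{domain}/resource/{dataset_id}.json?{query}"


def ckan_search_url(host: str, text: str, rows: int = 10) -> str:
    """Build a CKAN Action API URL that searches the dataset catalog."""
    query = urlencode({"q": text, "rows": rows})
    return f"https://{host}/api/3/action/package_search?{query}"


# Placeholder hosts and dataset ID, for illustration only.
print(socrata_url("data.cambridgema.gov", "abcd-1234", limit=100))
print(ckan_search_url("catalog.data.gov", "hospital", rows=5))
```

A dataset package could fetch JSON from such URLs and insert the rows into a dc.data.* class at install time.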
Thanks for sharing, Raj!
Here are two examples using Synthea:
And two example repositories using LOAD DATA (but not yet ZPM-enabled):
That's a great competition. I look forward to it.
Developers!
Only 2 days left and the InterSystems Datasets Contest begins!
Get ready to upload your applications!
Happy weekends!😊
The InterSystems Datasets Contest has only just started and we already have the first competitor in the game!
Medical Dataset by @Muhammad Waseem
Check it out!
Upload your applications, we are waiting for your solutions!
A new competitor is in the game with 2 great applications!
Dataset OEX reviews by @Robert Cemper
Dataset Lightweight M:N by @Robert Cemper
Check them out!
And don't forget to join a new InterSystems Datasets Contest 🏆
Developers!
Happy new year and merry Xmas!
Have a great weekend, and happiness to everyone!
Wow! One more application on the Contest board:
openflights_dataset by @Andreas Schneider
We're also waiting for other participants and their cool apps!
Happy weekends!
Another application is in the competition!
iris-kaggle-socrata-generator by @Henrique Dias
Who is gonna be next? We are waiting for you!
Community!
Only 4 days left to register your application! Upload your app and join the competition!
And we have 2 more new apps by developers in the InterSystems Datasets Contest!
dataset-covid19-fake-news by @Henry Pereira
Health Dataset by @Yuri Marx
Developers!
One more application on the Contest board:
exchange-rate-cbrf by @Sergey Mikhailenko
Hurry up to upload your solutions! 😎
Hey Devs!
The registration period will be over on Monday!
So don't forget to use Technology Bonuses to get extra points in the voting!
And by the way, we have another application in the game:
iris-python-faker by @Dmitry Maslennikov
We are waiting for your solutions, join the InterSystems Datasets Contest!
Happy weekends😊