Our competition is dedicated to providing a harassment-free experience for everyone, regardless of gender, gender identity and expression, age, sexual orientation, disability, physical appearance, body size, race, ethnicity, nationality, religion, previous competition attendance or computing experience (or lack of any of the aforementioned). We do not tolerate harassment of competition participants in any form. Competition participants violating these rules may be sanctioned or expelled from the competition without a refund (if applicable) at the discretion of the competition organizers.
Q: What is DataSci '17?
- DataSci is an online competition aiming to support student-created data science projects. The competition will be open to all university students globally, and will involve the following topics: creating unique algorithms for analyzing data, visualizations, write-ups on derived insights, and more. Our submission platform will open March 1st, and will close on April 8th. Participants will work together in small teams to find undiscovered patterns and new meanings in datasets of their choice. General guidelines will be provided, but DataSci will primarily be about engaging the creativity of its participants.
Q: What should my submissions be about?
- Your submissions should primarily be a novel analysis of a dataset of your choice. You may choose how this analysis is accomplished, but it must be obtained through computational means. We have a set of questions that we will ask for each submission, which will include the following: "Which algorithms did you use?", "What did you discover from the data?", and "What are the implications of your insights, or how can those insights be used?".
Q: Who is allowed to participate in this competition?
- Anybody who is a student at a university. We do not restrict our submissions based on geography. Though this competition is hosted in the United States, we welcome international participants.
Q: When are projects due, and are we allowed to work on projects before this date?
- Our submission portal will be open from March 1st to April 8th. ALL projects must be submitted between these dates to be considered for prizes. You are allowed to begin work on projects before these dates, but we ask that you do not reuse old data-science projects (projects you created before you knew about this competition) as a submission.
Q: Are we allowed to have teams, and, if so, how big can they be?
- You are allowed to have teams, and we limit team size to a max of 4. As a side note, everybody on your team MUST have individually signed up on the Typeform. We will disqualify entire teams if even a single member of the team has not signed up on the Typeform by the end of the competition.
Q: What's worth submitting?
- Pretty much anything. Modern scientific research has discouraged the publication of 'negative' results; that is, results which tell us what isn't the answer, as opposed to what is. On the other hand, we here at DataSci '17 believe that 'positive' and 'negative' results from data analysis are both interesting and something we'd love to see.
Q: What are some examples of possible data science projects?
- "Is there a connection between the popularity of a business’s social media account and the vocabulary size of its posts?"
- "What specific brain regions ‘light up’ consistently when an fMRI is performed while a patient listens to a specific genre of music?"
- "Is there a relationship between how an exoplanet was discovered [imaging, radial velocity, etc], and specific characteristics of that exoplanet?"
- Needless to say, there are a LOT of possible projects you could do involving data analytics.
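- As a rough illustration of what "computational means" could look like for the first question above, here is a minimal sketch. It is not an official template: the toy accounts, follower counts, and the helper names `vocabulary_size` and `pearson` are all made up for this example, and a real project would need far more data and a more careful method.

```python
import math
import re

# Hypothetical toy data (invented for illustration only):
# account name -> (follower count, list of post texts)
ACCOUNTS = {
    "cafe": (1200, ["Fresh coffee today", "Coffee and cake today"]),
    "bookshop": (5400, ["New arrivals this week",
                        "Signed first editions arrive Friday"]),
    "gym": (800, ["Open late", "Open early"]),
}


def vocabulary_size(posts):
    """Count the distinct lowercase word tokens across all posts."""
    words = set()
    for post in posts:
        words.update(re.findall(r"[a-z']+", post.lower()))
    return len(words)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


followers = [count for count, _ in ACCOUNTS.values()]
vocab_sizes = [vocabulary_size(posts) for _, posts in ACCOUNTS.values()]
r = pearson(followers, vocab_sizes)
print(f"Pearson r between followers and vocabulary size: {r:.3f}")
```

- A correlation on three made-up accounts says nothing on its own; the point is only that the question can be turned into something you can compute, which is what your write-up would then interpret.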
Q: What types of datasets are we allowed to use?
- We highly encourage the use of publicly available datasets, as competition staff and judges must be able to access the data used to build your submission. However, we will consider approving the use of private datasets on a case-by-case basis. You must be able to provide written documentation proving that each member of your team is authorized to use that dataset. Furthermore, the owner(s) of the private dataset must agree in writing to allow your project and its findings to become publicly available, as well as provide event staff access to the dataset during project judging. You must send your initial request for approval to firstname.lastname@example.org no later than 11:59 CST on March 12th, 2017. Your request must include the name of the dataset, contact information for the dataset's owner(s), and proof that each member of your team is authorized to access the dataset. Submission of a private dataset approval request does not guarantee that your dataset will be approved. Intentionally falsifying documents at any point in this process is grounds for immediate disqualification.
Q: Why do private datasets need approval in advance?
- The DataSci '17 Staff are dedicated to protecting the intellectual property of our competitors and their datasets. While we do not want to limit the creativity of our competitors, we must have complete documentation before approving a private dataset to protect everyone involved.
Q: If we do receive approval to use a private dataset, are we required to release the dataset to the public?
- No. However, the owner(s) of the dataset must agree to let DataSci '17 staff and judges access the dataset while judging your submission.
Q: Can we submit multiple times?
- Yes! We highly encourage this, but we will limit you/your team to a max of 3 submissions. Keep in mind, if a single one of your projects wins an opt-in prize, all other projects from you/your team are taken out of the running for other opt-in prizes. Winners of opt-in prizes will still be able to win the Popular Choice award.
Q: Will we be required to disclose our source code?
- Yes. This is done to protect against plagiarism and to make sure that your code is open-source. All code must be uploaded to an open-source GitHub repo.
Q: Can we use X language, Y API, and Z library?
- Absolutely. We do not especially care how you manage to come about your analysis. Our single restriction is that you are not allowed to use 3rd party data-analytics companies/services for your submission; all work must be done by you/your team. Visualizations can be created using commercially available software.
Q: Is there a (linguistically speaking) language that you require submissions to be in?
- Yes, we require all submissions to be in English. While we will make an attempt to overlook typos/incorrect grammar, in light of the diversity of participants we're expecting, our judges will look favorably upon clear/correct writing. In the case that you require assistance in writing your analysis due to language barriers, please don't hesitate to post a message on the Discussion board; we will try our best to help you.
Q: How do you submit a project?
- Simply submit through Devpost. If you have any questions on this process, just post in the Discussion board and we'll get to it.
Q: Which questions do you have to answer during the submissions process?
- Which algorithms did you primarily use to analyze the data?
- Which datasets did you use?
- What did you discover from the data?
- What are the implications of your insights, or how can those insights be used?
- Link us to the GitHub repository containing all the code you used to analyze the dataset.
- Who is in your team? Fill this out even if you are working solo.
Q: How will the submissions be judged?
- We understand that the 'interesting-ness' of a data-science project often doesn't come from the data, but from the analysis writeup following it. Accordingly, your submission will be primarily judged on what you write in response to the questions we ask in the submission form and on any visualizations you provide. Obviously, we will place some weight on the novelty of the dataset you used and the subject of your project, but we will look more closely at how well you, as the participant, convey what you learned from the data.
Q: Do we retain intellectual property of our own analysis once it's submitted?
Any further questions? Just post in the Discussion board and we'll get a response to you as soon as we can! Happy analyzing :)