# dR stats
This project is made to determine the health of the devRant developer community.
The data will also be used for retoor9b, the newest AI hype! You're still using ChatGPT?
## Statistics by last build
Click here for the latest [dataset](https://retoor.molodetz.nl/retoor/drstats/src/branch/main/export/0_dataset.txt).
Click here for the latest [graphs compilation](https://retoor.molodetz.nl/retoor/drstats/src/branch/main/export/1_graphs_compliation.png).
Click here for all generated [data](https://retoor.molodetz.nl/retoor/drstats/src/branch/main/export). It's a big dataset containing data for LLMs to train on, graphs per user or overall statistics, and JSON files with all observations made.
Statistics are built automatically by a build server.
Generating these statistics takes quite a few steps. Look at the build log under the [actions](https://retoor.molodetz.nl/retoor/drstats/actions?workflow=export.yaml&actor=0&status=1) tab.
## Credits
Thanks to Rohan Burke (coolq), the creator of the dr API wrapper this project uses. Since it isn't published as a package, I had to copy his source files into my source folder. His library: https://github.com/coolq1000/devrant-python-api/
## Using this project
### Prepare environment
Create python3 environment:
```
python3 -m venv ./venv
```
Activate python3 environment:
```
source ./venv/bin/activate
```
### Make
You don't need anything besides `make`. If you just run `make`, all statistics will be generated; it executes the right apps for you.
### Applications
If you type `dr.` in a terminal and press Tab, you'll see all available apps autocompleted. These applications are also used by `make`.
1. `dr.sync` synchronizes all data from the last two weeks from devRant. Only two weeks, because the API is rate limited.
2. `dr.dataset` exports all data to be used for LLM embedding. Don't forget to execute `dr.sync` first.
3. `dr.stats_all` exports all graphs to the export folder. Don't forget to execute `dr.sync` first.
4. `dr.rant_stats_per_day` exports graphs to the export folder. Don't forget to execute `dr.sync` first.
5. `dr.rant_stats_per_hour` exports graphs to the export folder. Don't forget to execute `dr.sync` first.
6. `dr.rant_stats_per_weekday` exports graphs to the export folder. Don't forget to execute `dr.sync` first.
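
The one ordering rule in the list above (run `dr.sync` before any exporter) can be sketched as a small Python driver. This is only an illustration, not how `make` is actually wired up in this project. The helper names `build_plan` and `run_plan` are hypothetical, and the sketch assumes the apps can be called without arguments, which this README does not confirm.

```python
import shutil
import subprocess

# Command names are copied from the list above. The only dependency the
# README states is that dr.sync has to run before the exporters.
SYNC = "dr.sync"
EXPORTERS = [
    "dr.dataset",
    "dr.stats_all",
    "dr.rant_stats_per_day",
    "dr.rant_stats_per_hour",
    "dr.rant_stats_per_weekday",
]


def build_plan(exporters=EXPORTERS):
    """Return the commands in a safe order: sync first, then the exporters."""
    return [SYNC, *exporters]


def run_plan(plan, dry_run=True):
    """Run each command in order and stop at the first failure.

    With dry_run=True (the default) nothing is executed; the commands are
    only printed. Calling the apps without arguments is an assumption.
    """
    for command in plan:
        if dry_run:
            print(f"would run: {command}")
            continue
        if shutil.which(command) is None:
            raise FileNotFoundError(f"{command} not found; is the venv active?")
        subprocess.run([command], check=True)


if __name__ == "__main__":
    run_plan(build_plan())
```

Calling `run_plan(build_plan(), dry_run=False)` inside the activated environment would actually execute the apps. The dry run is the default so the sketch is safe to try without hitting the rate-limited API.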
## Observations made by AI regarding statistics
The model used for generating these observations is called `smoll2`, a 1.7b model.
The report below contains some inaccuracies, but I'm working on that by testing several models. I am limited by the power my server provides for running LLMs. I do not own a decent GPU.
If I attached the ChatGPT API to my project, the statistics would be better, if not perfect. I have tested this. Sadly, the API costs too much for a hobby project and I refuse to use a credit card. There are better payment options that I prefer, but OpenAI does not provide them.
### Several trends and insights about the devRant community
1. The most active users seem to be posting more than once per month. This
could indicate that these individuals are very engaged with the community
or have a high level of interest in participating in discussions.
2. There is a large range in the post lengths, ranging from 19 characters
(kienkhongngu) to 742 characters (Pogromist). While there may be some
outliers due to formatting issues or other factors, this suggests that
users have varying levels of engagement and writing style on the forum.
3. The "ownership_content" value ranges from -0.5 to 1. This indicates
that while some users do not post much, others are heavily involved with
frequent and in-depth contributions. However, it's unclear what specific
metric this represents or how it correlates with user engagement.
4. The most common "upvotes" per month range from 0 to 9 (arekxv) and 21
(-1 for negative upvotes). This suggests that while users are posting
relatively often, there may be some variability in their level of
agreement with the content they're sharing or commenting on.
5. Overall, the data indicates a moderate level of engagement from users.
While there is no clear indication of highly active users dominating the
forum, the overall statistics suggest an engaged community where users
contribute regularly and interact with each other's posts.
### Detailed GPT information about a certain user
The username is anonymized, and so are the actual values.
#### Analysis of user **neo**
1. **Rank and Contributions**:
   - **Rank**: 45th overall
   - **Contributions**: 45 posts
2. **Ownership**:
   - **Ownership**: 0.10, indicating that neo holds a slightly larger portion of the content ownership in the dataset.
3. **Upvotes**:
   - **Upvotes**: 120 upvotes in total, suggesting that neo's content receives a notable level of recognition.
   - **Upvotes Ownership**: 0.03, meaning that neo owns about 3% of all upvotes in the dataset.
   - **Upvote Ratio**: 2.67, which implies a solid amount of engagement with neo's posts.
4. **Post Length**:
   - **Total Post Length**: 5,100 characters
   - **Average Post Length**: 113 characters per post, suggesting relatively concise contributions.
##### Summary:
- **neo** is ranked 45th with 45 posts, indicating a modest level of contribution.
- Their **ownership** percentage (0.10) reflects a slightly larger share of content ownership.
- The **upvotes** received (120) and **upvote ratio** (2.67) suggest a notable level of engagement with their content.
- The **average post length** of 113 characters indicates concise contributions.
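
The derived values in the report above can be reproduced from the raw counts. This README does not define the metrics, so the sketch below works under assumptions: the upvote ratio is taken to be total upvotes divided by posts, the average post length to be total characters divided by posts, and "ownership" to be a user's share of a dataset-wide total. The function names are hypothetical and are not taken from the project's source.

```python
def upvote_ratio(upvotes: int, posts: int) -> float:
    """Assumed definition: upvotes received per post."""
    return upvotes / posts if posts else 0.0


def average_post_length(total_length: int, posts: int) -> float:
    """Assumed definition: characters written per post."""
    return total_length / posts if posts else 0.0


def ownership(user_total: float, dataset_total: float) -> float:
    """Assumed definition: a user's share of a dataset-wide total (0 to 1)."""
    return user_total / dataset_total if dataset_total else 0.0


# Anonymized values for user "neo" from the report above.
posts, upvotes, total_length = 45, 120, 5100

print(round(upvote_ratio(upvotes, posts), 2))           # 2.67
print(round(average_post_length(total_length, posts)))  # 113
```

Both results match the report, which suggests these two definitions are at least consistent with what the model was given. The dataset-wide totals needed to check the ownership values are not listed here, so `ownership` is left uncalled.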