80 Commits

Author SHA1 Message Date
Stepan Vladovskiy
a1a61a6731 feat: follow the same logic as shout search for authors. Store them in the Redis cache + pagination
All checks were successful
Deploy on push / deploy (push) Successful in 41s
2025-05-02 18:17:05 -03:00
Stepan Vladovskiy
8d6ad2c84f refactor(author.py): remove verbose logging at resolver level 2025-05-02 18:04:10 -03:00
Stepan Vladovskiy
beba1992e9 fix(__init__.py): clean up resolver name for author search loading
All checks were successful
Deploy on push / deploy (push) Successful in 39s
2025-04-29 19:49:47 -03:00
Stepan Vladovskiy
b0296d7747 fix(__init__.py): added the new resolver to the resolver lists
All checks were successful
Deploy on push / deploy (push) Successful in 41s
2025-04-29 19:40:20 -03:00
Stepan Vladovskiy
98e3dff35e fix(author.py): resolver load_authors_search error fix
All checks were successful
Deploy on push / deploy (push) Successful in 40s
2025-04-29 18:00:38 -03:00
Stepan Vladovskiy
3782a9dffb fix(search.py, author.py): small startup fixes; logger import fails
All checks were successful
Deploy on push / deploy (push) Successful in 40s
2025-04-29 17:50:51 -03:00
Stepan Vladovskiy
93c00b3dd1 feat(author.py): add resolver for searching authors by text
All checks were successful
Deploy on push / deploy (push) Successful in 1m15s
2025-04-29 17:45:37 -03:00
Stepan Vladovskiy
fac43e5997 refactor(search,reader): without any kind of sorting
All checks were successful
Deploy on push / deploy (push) Successful in 42s
2025-04-24 21:00:41 -03:00
Stepan Vladovskiy
e7facf8d87 style(search.py): with indexing message
All checks were successful
Deploy on push / deploy (push) Successful in 42s
2025-04-24 18:45:00 -03:00
Stepan Vladovskiy
3062a2b7de refactor(search.py): check titles without bodies to avoid re-indexing them on every startup
All checks were successful
Deploy on push / deploy (push) Successful in 42s
2025-04-24 14:58:14 -03:00
Stepan Vladovskiy
c0406dbbf2 refactor(search.py): without logger and remove duplicated search_text def
All checks were successful
Deploy on push / deploy (push) Successful in 44s
2025-04-24 14:18:14 -03:00
Stepan Vladovskiy
ab4610575f refactor(reader.py): handle combined search
All checks were successful
Deploy on push / deploy (push) Successful in 44s
2025-04-24 13:56:38 -03:00
Stepan Vladovskiy
5425dbf832 refactor(search.py): simplify def search 2025-04-24 13:46:58 -03:00
Stepan Vladovskiy
a10db2d38a feat(search.py): combined search on shout titles and bodies 2025-04-24 13:35:36 -03:00
Stepan Vladovskiy
83e70856cd debug(server.py): I don't know why it appears, but I'm removing it
All checks were successful
Deploy on push / deploy (push) Successful in 41s
2025-04-23 18:32:58 -03:00
Stepan Vladovskiy
11654dba68 feat: with three separate endpoints
All checks were successful
Deploy on push / deploy (push) Successful in 5s
2025-04-23 18:24:00 -03:00
Stepan Vladovskiy
ec9465ad40 merge dev
All checks were successful
Deploy on push / deploy (push) Successful in 46s
2025-04-20 19:24:59 -03:00
Stepan Vladovskiy
4d965fb27b feat(search.py): separate indexing of Shout Title, shout Body and Authors
All checks were successful
Deploy on push / deploy (push) Successful in 39s
2025-04-20 19:22:08 -03:00
aaa6022a53 draft-create-fix4
All checks were successful
Deploy on push / deploy (push) Successful in 46s
2025-04-16 14:17:59 +03:00
d6ada44c7f draft-create-fix3
All checks were successful
Deploy on push / deploy (push) Successful in 48s
2025-04-16 11:51:19 +03:00
243f836f0a draft-create-fix2
All checks were successful
Deploy on push / deploy (push) Successful in 46s
2025-04-16 11:48:47 +03:00
536c094e72 draft-create-fix
All checks were successful
Deploy on push / deploy (push) Successful in 48s
2025-04-16 11:45:38 +03:00
Stepan Vladovskiy
e382cc1ea5 Merge branch 'dev' into feat/sv-searching-txtai
All checks were successful
Deploy on push / deploy (push) Successful in 6s
2025-04-15 19:20:48 -03:00
6920351b82 schema-fix
All checks were successful
Deploy on push / deploy (push) Successful in 52s
2025-04-15 20:30:12 +03:00
eb216a5f36 draft-seo-handling
All checks were successful
Deploy on push / deploy (push) Successful in 1m10s
2025-04-15 20:16:01 +03:00
bd129efde6 update-seo-handling 2025-04-15 20:14:42 +03:00
b9f6033e66 generate seo text when draft created 2025-04-15 20:09:22 +03:00
710f522c8f schema-upgrade
All checks were successful
Deploy on push / deploy (push) Successful in 47s
2025-04-14 19:53:14 +03:00
0de4404cb1 draft-community
All checks were successful
Deploy on push / deploy (push) Successful in 47s
2025-04-14 16:02:19 +03:00
83d61ca76d Merge branch 'dev' into feat/sv-searching-txtai
All checks were successful
Deploy on push / deploy (push) Successful in 6s
2025-04-13 05:36:18 +00:00
1c61e889d6 update-draft-fix
All checks were successful
Deploy on push / deploy (push) Successful in 47s
2025-04-10 22:51:07 +03:00
fdedb75a2c topics-comments-stat
All checks were successful
Deploy on push / deploy (push) Successful in 45s
2025-04-10 19:14:27 +03:00
f20000f1f6 topic.stat.authors-fix2
All checks were successful
Deploy on push / deploy (push) Successful in 45s
2025-04-10 18:46:09 +03:00
7d50638b3a topic.stat.authors-fix
All checks were successful
Deploy on push / deploy (push) Successful in 1m20s
2025-04-10 18:39:31 +03:00
Stepan Vladovskiy
106222b0e0 debug: without debug logging. clean
All checks were successful
Deploy on push / deploy (push) Successful in 1m27s
2025-04-07 11:41:48 -03:00
Stepan Vladovskiy
c533241d1e fix(reader): sort by rank, not by id, in cache
All checks were successful
Deploy on push / deploy (push) Successful in 6s
2025-04-03 13:51:13 -03:00
Stepan Vladovskiy
78326047bf fix(reader.py): change sorting and responses to queries
All checks were successful
Deploy on push / deploy (push) Successful in 50s
2025-04-03 13:20:18 -03:00
Stepan Vladovskiy
bc4ec79240 fix(search.py): store all results in cache, not only the first offset
All checks were successful
Deploy on push / deploy (push) Successful in 52s
2025-04-03 13:10:53 -03:00
Stepan Vladovskiy
a0db5707c4 feat: add cache for storing search results and hold them for working pagination. Now we have an offset to use on the frontend
All checks were successful
Deploy on push / deploy (push) Successful in 51s
2025-04-01 16:01:09 -03:00
Stepan Vladovskiy
ecc443c3ad refactor(reader.py): Remove the unnecessary topic joins that cause duplicate results
All checks were successful
Deploy on push / deploy (push) Successful in 51s
2025-04-01 12:57:46 -03:00
Stepan Vladovskiy
9a02ca74ad merged with dev
All checks were successful
Deploy on push / deploy (push) Successful in 1m24s
2025-03-31 13:38:32 -03:00
Stepan Vladovskiy
9ebb81cbd3 refactor(reader.py): rm debug line 2025-03-31 13:32:51 -03:00
abbc074474 updateby-fix
All checks were successful
Deploy on push / deploy (push) Successful in 54s
2025-03-31 14:39:02 +03:00
Stepan Vladovskiy
0bc55977ac debug(reader.py): query_with_stat(info) always
All checks were successful
Deploy on push / deploy (push) Successful in 51s
2025-03-27 15:18:08 -03:00
Stepan Vladovskiy
ff3a4debce debug(reader.py): trying to handle main topic ids found
All checks were successful
Deploy on push / deploy (push) Successful in 54s
2025-03-27 14:43:17 -03:00
Stepan Vladovskiy
ae85b32f69 feat(type.graphql): SearchResult with shout id
All checks were successful
Deploy on push / deploy (push) Successful in 51s
2025-03-27 14:06:52 -03:00
Stepan Vladovskiy
34a354e9e3 debug(reader.py): bring back shout id in query call
All checks were successful
Deploy on push / deploy (push) Successful in 52s
2025-03-27 11:54:56 -03:00
4f599e097f [0.4.17] - 2025-03-26
All checks were successful
Deploy on push / deploy (push) Successful in 54s
- Fixed `'Reaction' object is not subscriptable` error in hierarchical comments:
  - Modified `get_reactions_with_stat()` to convert Reaction objects to dictionaries
  - Added default values for limit/offset parameters
  - Fixed `load_first_replies()` implementation with proper parameter passing
  - Added doctest with example usage
  - Limited child comments to 100 per parent for performance
2025-03-26 08:54:10 +03:00
a5eaf4bb65 commented->comments_count
All checks were successful
Deploy on push / deploy (push) Successful in 55s
2025-03-26 08:25:18 +03:00
Stepan Vladovskiy
e405fb527b refactor(search.py): moved to using one docs table for embeddings and document store
All checks were successful
Deploy on push / deploy (push) Successful in 50s
2025-03-25 16:42:44 -03:00
Stepan Vladovskiy
7f36f93d92 feat(search.py): detects both missing documents and null embeddings
All checks were successful
Deploy on push / deploy (push) Successful in 1m32s
2025-03-25 15:18:29 -03:00
Stepan Vladovskiy
f089a32394 debug(search.py): with more logs when checking indexing sync
All checks were successful
Deploy on push / deploy (push) Successful in 1m3s
2025-03-25 14:44:05 -03:00
Stepan Vladovskiy
1fd623a660 feat: with index sync endpoints configs
All checks were successful
Deploy on push / deploy (push) Successful in 56s
2025-03-25 13:31:45 -03:00
Stepan Vladovskiy
88012f1b8c debug(server.py): with 4 workers (threads); checking reindexing
All checks were successful
Deploy on push / deploy (push) Successful in 55s
2025-03-25 12:21:59 -03:00
Stepan Vladovskiy
6e284640c0 feat: give a little timeout for resource stabilization
All checks were successful
Deploy on push / deploy (push) Successful in 51s
2025-03-24 21:42:51 -03:00
Stepan Vladovskiy
077cb46482 debug: server.py -> threads 1, search.py -> add 3 reconnect attempts
All checks were successful
Deploy on push / deploy (push) Successful in 49s
2025-03-24 20:16:07 -03:00
Stepan Vladovskiy
60a13a9097 refactor(search.py): moved initialization logic into the search-txtai instance
All checks were successful
Deploy on push / deploy (push) Successful in 55s
2025-03-24 19:47:02 -03:00
3c56fdfaea get_topics_paginated-fix
All checks were successful
Deploy on push / deploy (push) Successful in 56s
2025-03-22 18:49:15 +03:00
81a8bf3c58 query-type-fix
All checks were successful
Deploy on push / deploy (push) Successful in 49s
2025-03-22 18:44:31 +03:00
fe9984e2d8 type-fix
All checks were successful
Deploy on push / deploy (push) Successful in 43s
2025-03-22 18:39:14 +03:00
369ff757b0 [0.4.16] - 2025-03-22
All checks were successful
Deploy on push / deploy (push) Successful in 6s
- Added hierarchical comments pagination:
  - Created new GraphQL query `load_comments_branch` for efficient loading of hierarchical comments
  - Ability to load root comments with their first N replies
  - Added pagination for both root and child comments
  - Using existing `commented` field in `Stat` type to display number of replies
  - Added special `first_replies` field to store first replies to a comment
  - Optimized SQL queries for efficient loading of comment hierarchies
  - Implemented flexible comment sorting system (by time, rating)
2025-03-22 13:37:43 +03:00
Stepan Vladovskiy
316375bf18 debug(search.py): increase batch size for bulk indexing
All checks were successful
Deploy on push / deploy (push) Successful in 1m1s
2025-03-21 17:56:54 -03:00
Stepan Vladovskiy
fb820f67fd debug(search.py): increase batch size for bulk indexing
All checks were successful
Deploy on push / deploy (push) Successful in 53s
2025-03-21 17:48:26 -03:00
Stepan Vladovskiy
f1d9f4e036 feat(search.py): with db reset endpoint
All checks were successful
Deploy on push / deploy (push) Successful in 53s
2025-03-21 17:28:54 -03:00
Stepan Vladovskiy
ebb67eb311 debug: decrease chars in search.py for bulk indexing
All checks were successful
Deploy on push / deploy (push) Successful in 52s
2025-03-21 16:53:00 -03:00
Stepan Vladovskiy
50a8c24ead feat(search.py): documents for bulk indexing are categorized
All checks were successful
Deploy on push / deploy (push) Successful in 55s
2025-03-21 15:40:29 -03:00
Stepan Vladovskiy
eb4b9363ab debug: change log entries; indexing does not wrap everything in documents
All checks were successful
Deploy on push / deploy (push) Successful in 53s
2025-03-21 14:32:45 -03:00
Stepan Vladovskiy
19c5028a0c debug: Limit max chars for bulk indexing
All checks were successful
Deploy on push / deploy (push) Successful in 53s
2025-03-21 14:18:32 -03:00
Stepan Vladovskiy
57e1e8e6bd debug: more logs in indexing
All checks were successful
Deploy on push / deploy (push) Successful in 53s
2025-03-21 14:10:09 -03:00
Stepan Vladovskiy
385057ffcd debug: with logs in indexing procedure
All checks were successful
Deploy on push / deploy (push) Successful in 54s
2025-03-21 13:45:50 -03:00
Stepan Vladovskiy
90699768ff debug: start index
All checks were successful
Deploy on push / deploy (push) Successful in 55s
2025-03-21 13:30:23 -03:00
Stepan Vladovskiy
ad0ca75aa9 debug: no redis for indexing on the backend side
All checks were successful
Deploy on push / deploy (push) Successful in 1m41s
2025-03-19 14:47:31 -03:00
Stepan Vladovskiy
39242d5e6c debug: add logs in search.py and change input validation ... index ver too
All checks were successful
Deploy on push / deploy (push) Successful in 55s
2025-03-12 14:13:55 -03:00
Stepan Vladovskiy
24cca7f2cb debug: something went wrong, one step back with logs
All checks were successful
Deploy on push / deploy (push) Successful in 53s
2025-03-12 13:11:19 -03:00
Stepan Vladovskiy
a9c7ac49d6 feat: with logs >>>
All checks were successful
Deploy on push / deploy (push) Successful in 59s
2025-03-12 13:07:27 -03:00
Stepan Vladovskiy
f249752db5 feat: moved txtai and the search procedure to a different instance
All checks were successful
Deploy on push / deploy (push) Successful in 2m18s
2025-03-12 12:06:09 -03:00
Stepan Vladovskiy
c0b2116da2 feat(db.py): added fetch_all_shouts, to populate the search index
All checks were successful
Deploy on push / deploy (push) Successful in 35s
2025-03-05 20:32:34 +00:00
Stepan Vladovskiy
59e71c8144 debug: fixed workflows gitea
All checks were successful
Deploy on push / deploy (push) Successful in 4m41s
2025-03-05 20:17:34 +00:00
Stepan Vladovskiy
e6a416383d debug: fixed workflows gitea
All checks were successful
Deploy on push / deploy (push) Successful in 15s
2025-03-05 20:16:32 +00:00
Stepan Vladovskiy
d55448398d feat(search.py): change to txtai server, with ai model. And fix granian workers 2025-03-05 20:08:21 +00:00
24 changed files with 1727 additions and 331 deletions

View File

@@ -29,7 +29,16 @@ jobs:
if: github.ref == 'refs/heads/dev'
uses: dokku/github-action@master
with:
branch: 'dev'
branch: 'main'
force: true
git_remote_url: 'ssh://dokku@v2.discours.io:22/core'
ssh_private_key: ${{ secrets.SSH_PRIVATE_KEY }}
- name: Push to dokku for staging branch
if: github.ref == 'refs/heads/staging'
uses: dokku/github-action@master
with:
branch: 'dev'
git_remote_url: 'ssh://dokku@staging.discours.io:22/core'
ssh_private_key: ${{ secrets.SSH_PRIVATE_KEY }}
git_push_flags: '--force'

6
.gitignore vendored
View File

@@ -128,6 +128,9 @@ dmypy.json
.idea
temp.*
# Debug
DEBUG.log
discours.key
discours.crt
discours.pem
@@ -161,4 +164,5 @@ views.json
*.key
*.crt
*cache.json
.cursor
.cursor
.devcontainer/

View File

@@ -1,3 +1,33 @@
#### [0.4.19] - 2025-04-14
- dropped `Shout.description` and `Draft.description` to be UX-generated
- use redis to init views counters after migrator
#### [0.4.18] - 2025-04-10
- Fixed `Topic.stat.authors` and `Topic.stat.comments`
- Fixed unique constraint violation for empty slug values:
- Modified `update_draft` resolver to handle empty slug values
- Modified `create_draft` resolver to prevent empty slug values
- Added validation to prevent inserting or updating drafts with empty slug
- Fixed database error "duplicate key value violates unique constraint draft_slug_key"
#### [0.4.17] - 2025-03-26
- Fixed `'Reaction' object is not subscriptable` error in hierarchical comments:
- Modified `get_reactions_with_stat()` to convert Reaction objects to dictionaries
- Added default values for limit/offset parameters
- Fixed `load_first_replies()` implementation with proper parameter passing
- Added doctest with example usage
- Limited child comments to 100 per parent for performance
#### [0.4.16] - 2025-03-22
- Added hierarchical comments pagination:
- Created new GraphQL query `load_comments_branch` for efficient loading of hierarchical comments
- Ability to load root comments with their first N replies
- Added pagination for both root and child comments
- Using existing `comments_count` field in `Stat` type to display number of replies
- Added special `first_replies` field to store first replies to a comment
- Optimized SQL queries for efficient loading of comment hierarchies
- Implemented flexible comment sorting system (by time, rating)
#### [0.4.15] - 2025-03-22
- Upgraded caching system described in `docs/caching.md`
- Module `cache/memorycache.py` removed
@@ -31,8 +61,7 @@
- Implemented persistent Redis caching for author queries without TTL (invalidated only on changes)
- Optimized author retrieval with separate endpoints:
- `get_authors_all` - returns all non-deleted authors without statistics
- `get_authors_paginated` - returns authors with statistics and pagination support
- `load_authors_by` - optimized to use caching and efficient sorting
- `load_authors_by` - optimized to use caching and efficient sorting and pagination
- Improved SQL queries with optimized JOIN conditions and efficient filtering
- Added pre-aggregation of statistics (shouts count, followers count) in single efficient queries
- Implemented robust cache invalidation on author updates
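A hedged sketch of the TTL-less Redis caching pattern these entries describe (key name and helper names are assumptions for illustration; the real implementation lives in `cache/cache.py` and `resolvers/author.py`, which are only partially shown in this diff):

```python
import orjson

from services.redis import redis  # project Redis wrapper used elsewhere in this diff

AUTHORS_ALL_KEY = "authors:all"  # assumed key name, illustrative only


async def get_authors_all_cached(fetch_from_db):
    """Return all authors from Redis; fall back to the DB and cache without TTL."""
    cached = await redis.execute("GET", AUTHORS_ALL_KEY)
    if cached:
        return orjson.loads(cached)
    authors = await fetch_from_db()
    # No TTL: the entry lives until an author change explicitly invalidates it.
    await redis.execute("SET", AUTHORS_ALL_KEY, orjson.dumps(authors))
    return authors


async def invalidate_authors_cache():
    """Called from author mutations so stale data never outlives a change."""
    await redis.execute("DEL", AUTHORS_ALL_KEY)
```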
@@ -44,7 +73,6 @@
- Implemented persistent Redis caching for topic queries (no TTL, invalidated only on changes)
- Optimized topic retrieval with separate endpoints for different use cases:
- `get_topics_all` - returns all topics without statistics for lightweight listing
- `get_topics_paginated` - returns topics with statistics and pagination support
- `get_topics_by_community` - adds pagination and optimized filtering by community
- Added SQLAlchemy-managed indexes directly in ORM models for automatic schema maintenance
- Created `sync_indexes()` function for automatic index synchronization during app startup
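A minimal sketch of what an index-sync helper like this could look like; the actual `sync_indexes()` is not part of this diff, so names and behavior below are assumptions:

```python
from sqlalchemy import MetaData, inspect
from sqlalchemy.engine import Engine


def sync_indexes(engine: Engine, metadata: MetaData) -> None:
    """Create any ORM-declared indexes that are missing from the database."""
    inspector = inspect(engine)
    for table in metadata.sorted_tables:
        existing = {ix["name"] for ix in inspector.get_indexes(table.name)}
        for index in table.indexes:
            if index.name not in existing:
                index.create(bind=engine)  # create only the missing index
```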
@@ -142,7 +170,7 @@
#### [0.4.4]
- `followers_stat` removed for shout
- sqlite3 support added
- `rating_stat` and `commented_stat` fixes
- `rating_stat` and `comments_count` fixes
#### [0.4.3]
- cache reimplemented

1
app/resolvers/draft.py Normal file
View File

@@ -0,0 +1 @@

5
cache/cache.py vendored
View File

@@ -545,8 +545,9 @@ async def get_cached_data(key: str) -> Optional[Any]:
try:
cached_data = await redis.execute("GET", key)
if cached_data:
logger.debug(f"Данные получены из кеша по ключу {key}")
return orjson.loads(cached_data)
loaded = orjson.loads(cached_data)
logger.debug(f"Данные получены из кеша по ключу {key}: {len(loaded)}")
return loaded
return None
except Exception as e:
logger.error(f"Ошибка при получении данных из кеша: {e}")

165
docs/comments-pagination.md Normal file
View File

@@ -0,0 +1,165 @@
# Пагинация комментариев
## Обзор
Реализована система пагинации комментариев по веткам, которая позволяет эффективно загружать и отображать вложенные ветки обсуждений. Основные преимущества:
1. Загрузка только необходимых комментариев, а не всего дерева
2. Снижение нагрузки на сервер и клиент
3. Возможность эффективной навигации по большим обсуждениям
4. Предзагрузка первых N ответов для улучшения UX
## API для иерархической загрузки комментариев
### GraphQL запрос `load_comments_branch`
```graphql
query LoadCommentsBranch(
$shout: Int!,
$parentId: Int,
$limit: Int,
$offset: Int,
$sort: ReactionSort,
$childrenLimit: Int,
$childrenOffset: Int
) {
load_comments_branch(
shout: $shout,
parent_id: $parentId,
limit: $limit,
offset: $offset,
sort: $sort,
children_limit: $childrenLimit,
children_offset: $childrenOffset
) {
id
body
created_at
created_by {
id
name
slug
pic
}
kind
reply_to
stat {
rating
comments_count
}
first_replies {
id
body
created_at
created_by {
id
name
slug
pic
}
kind
reply_to
stat {
rating
comments_count
}
}
}
}
```
### Параметры запроса
| Параметр | Тип | По умолчанию | Описание |
|----------|-----|--------------|----------|
| shout | Int! | - | ID статьи, к которой относятся комментарии |
| parent_id | Int | null | ID родительского комментария. Если null, загружаются корневые комментарии |
| limit | Int | 10 | Максимальное количество комментариев для загрузки |
| offset | Int | 0 | Смещение для пагинации |
| sort | ReactionSort | newest | Порядок сортировки: newest, oldest, like |
| children_limit | Int | 3 | Максимальное количество дочерних комментариев для каждого родительского |
| children_offset | Int | 0 | Смещение для пагинации дочерних комментариев |
### Поля в ответе
Каждый комментарий содержит следующие основные поля:
- `id`: ID комментария
- `body`: Текст комментария
- `created_at`: Время создания
- `created_by`: Информация об авторе
- `kind`: Тип реакции (COMMENT)
- `reply_to`: ID родительского комментария (null для корневых)
- `first_replies`: Первые N дочерних комментариев
- `stat`: Статистика комментария, включающая:
- `comments_count`: Количество ответов на комментарий
- `rating`: Рейтинг комментария
## Примеры использования
### Загрузка корневых комментариев с первыми ответами
```javascript
const { data } = await client.query({
query: LOAD_COMMENTS_BRANCH,
variables: {
shout: 222,
limit: 10,
offset: 0,
sort: "newest",
childrenLimit: 3
}
});
```
### Загрузка ответов на конкретный комментарий
```javascript
const { data } = await client.query({
query: LOAD_COMMENTS_BRANCH,
variables: {
shout: 222,
parentId: 123, // ID комментария, для которого загружаем ответы
limit: 10,
offset: 0,
sort: "oldest" // Сортируем ответы от старых к новым
}
});
```
### Пагинация дочерних комментариев
Для загрузки дополнительных ответов на комментарий:
```javascript
const { data } = await client.query({
query: LOAD_COMMENTS_BRANCH,
variables: {
shout: 222,
parentId: 123,
limit: 10,
offset: 0,
childrenLimit: 5,
childrenOffset: 3 // Пропускаем первые 3 комментария (уже загруженные)
}
});
```
## Рекомендации по клиентской реализации
1. Для эффективной работы со сложными ветками обсуждений рекомендуется:
- Сначала загружать только корневые комментарии с первыми N ответами
- При наличии дополнительных ответов (когда `stat.comments_count > first_replies.length`)
добавить кнопку "Показать все ответы"
- При нажатии на кнопку загружать дополнительные ответы с помощью запроса с указанным `parentId`
2. Для сортировки:
- По умолчанию использовать `newest` для отображения свежих обсуждений
- Предусмотреть переключатель сортировки для всего дерева комментариев
- При изменении сортировки перезагружать данные с новым параметром `sort`
3. Для улучшения производительности:
- Кешировать результаты запросов на клиенте
- Использовать оптимистичные обновления при добавлении/редактировании комментариев
- При необходимости загружать комментарии порциями (ленивая загрузка)

View File

@@ -34,4 +34,15 @@
- Поддерживаемые методы: GET, POST, OPTIONS
- Настроена поддержка credentials
- Разрешенные заголовки: Authorization, Content-Type, X-Requested-With, DNT, Cache-Control
- Настроено кэширование preflight-ответов на 20 дней (1728000 секунд)
- Настроено кэширование preflight-ответов на 20 дней (1728000 секунд)
## Пагинация комментариев по веткам
- Эффективная загрузка комментариев с учетом их иерархической структуры
- Отдельный запрос `load_comments_branch` для оптимизированной загрузки ветки комментариев
- Возможность загрузки корневых комментариев статьи с первыми ответами на них
- Гибкая пагинация как для корневых, так и для дочерних комментариев
- Использование поля `stat.comments_count` для отображения количества ответов на комментарий
- Добавление специального поля `first_replies` для хранения первых ответов на комментарий
- Поддержка различных методов сортировки (новые, старые, популярные)
- Оптимизированные SQL запросы для минимизации нагрузки на базу данных

48
main.py
View File

@@ -17,7 +17,8 @@ from cache.revalidator import revalidation_manager
from services.exception import ExceptionHandlerMiddleware
from services.redis import redis
from services.schema import create_all_tables, resolvers
from services.search import search_service
#from services.search import search_service
from services.search import search_service, initialize_search_index
from services.viewed import ViewedStorage
from services.webhook import WebhookEndpoint, create_webhook_endpoint
from settings import DEV_SERVER_PID_FILE_NAME, MODE
@@ -34,24 +35,67 @@ async def start():
f.write(str(os.getpid()))
print(f"[main] process started in {MODE} mode")
async def check_search_service():
"""Check if search service is available and log result"""
info = await search_service.info()
if info.get("status") in ["error", "unavailable"]:
print(f"[WARNING] Search service unavailable: {info.get('message', 'unknown reason')}")
else:
print(f"[INFO] Search service is available: {info}")
# indexing DB data
# async def indexing():
# from services.db import fetch_all_shouts
# all_shouts = await fetch_all_shouts()
# await initialize_search_index(all_shouts)
async def lifespan(_app):
try:
print("[lifespan] Starting application initialization")
create_all_tables()
await asyncio.gather(
redis.connect(),
precache_data(),
ViewedStorage.init(),
create_webhook_endpoint(),
search_service.info(),
check_search_service(),
start(),
revalidation_manager.start(),
)
print("[lifespan] Basic initialization complete")
# Add a delay before starting the intensive search indexing
print("[lifespan] Waiting for system stabilization before search indexing...")
await asyncio.sleep(10) # 10-second delay to let the system stabilize
# Start search indexing as a background task with lower priority
asyncio.create_task(initialize_search_index_background())
yield
finally:
print("[lifespan] Shutting down application services")
tasks = [redis.disconnect(), ViewedStorage.stop(), revalidation_manager.stop()]
await asyncio.gather(*tasks, return_exceptions=True)
print("[lifespan] Shutdown complete")
# Initialize search index in the background
async def initialize_search_index_background():
"""Run search indexing as a background task with low priority"""
try:
print("[search] Starting background search indexing process")
from services.db import fetch_all_shouts
# Get total count first (optional)
all_shouts = await fetch_all_shouts()
total_count = len(all_shouts) if all_shouts else 0
print(f"[search] Fetched {total_count} shouts for background indexing")
# Start the indexing process with the fetched shouts
print("[search] Beginning background search index initialization...")
await initialize_search_index(all_shouts)
print("[search] Background search index initialization complete")
except Exception as e:
print(f"[search] Error in background search indexing: {str(e)}")
# Создаем экземпляр GraphQL
graphql_app = GraphQL(schema, debug=True)

View File

@@ -31,6 +31,7 @@ class Draft(Base):
# required
created_at: int = Column(Integer, nullable=False, default=lambda: int(time.time()))
created_by: int = Column(ForeignKey("author.id"), nullable=False)
community: int = Column(ForeignKey("community.id"), nullable=False, default=1)
# optional
layout: str = Column(String, nullable=True, default="article")
@@ -38,7 +39,6 @@ class Draft(Base):
title: str = Column(String, nullable=True)
subtitle: str | None = Column(String, nullable=True)
lead: str | None = Column(String, nullable=True)
description: str | None = Column(String, nullable=True)
body: str = Column(String, nullable=False, comment="Body")
media: dict | None = Column(JSON, nullable=True)
cover: str | None = Column(String, nullable=True, comment="Cover image url")

View File

@@ -71,6 +71,34 @@ class ShoutAuthor(Base):
class Shout(Base):
"""
Публикация в системе.
Attributes:
body (str)
slug (str)
cover (str) : "Cover image url"
cover_caption (str) : "Cover image alt caption"
lead (str)
title (str)
subtitle (str)
layout (str)
media (dict)
authors (list[Author])
topics (list[Topic])
reactions (list[Reaction])
lang (str)
version_of (int)
oid (str)
seo (str) : JSON
draft (int)
created_at (int)
updated_at (int)
published_at (int)
featured_at (int)
deleted_at (int)
created_by (int)
updated_by (int)
deleted_by (int)
community (int)
"""
__tablename__ = "shout"
@@ -91,7 +119,6 @@ class Shout(Base):
cover: str | None = Column(String, nullable=True, comment="Cover image url")
cover_caption: str | None = Column(String, nullable=True, comment="Cover image alt caption")
lead: str | None = Column(String, nullable=True)
description: str | None = Column(String, nullable=True)
title: str = Column(String, nullable=False)
subtitle: str | None = Column(String, nullable=True)
layout: str = Column(String, nullable=False, default="article")

View File

@@ -13,5 +13,10 @@ starlette
gql
ariadne
granian
# NLP and search
httpx
orjson
pydantic
pydantic
trafilatura

View File

@@ -8,6 +8,7 @@ from resolvers.author import ( # search_authors,
get_author_id,
get_authors_all,
load_authors_by,
load_authors_search,
update_author,
)
from resolvers.community import get_communities_all, get_community
@@ -37,6 +38,7 @@ from resolvers.reaction import (
create_reaction,
delete_reaction,
load_comment_ratings,
load_comments_branch,
load_reactions_by,
load_shout_comments,
load_shout_ratings,
@@ -70,6 +72,7 @@ __all__ = [
"get_author_follows_authors",
"get_authors_all",
"load_authors_by",
"load_authors_search",
"update_author",
## "search_authors",
# community
@@ -107,6 +110,7 @@ __all__ = [
"load_shout_comments",
"load_shout_ratings",
"load_comment_ratings",
"load_comments_branch",
# notifier
"load_notifications",
"notifications_seen_thread",

View File

@@ -20,6 +20,7 @@ from services.auth import login_required
from services.db import local_session
from services.redis import redis
from services.schema import mutation, query
from services.search import search_service
from utils.logger import root_logger as logger
DEFAULT_COMMUNITIES = [1]
@@ -232,22 +233,6 @@ async def get_authors_all(_, _info):
return await get_all_authors()
@query.field("get_authors_paginated")
async def get_authors_paginated(_, _info, limit=50, offset=0, by=None):
"""
Получает список авторов с пагинацией и статистикой.
Args:
limit: Максимальное количество возвращаемых авторов
offset: Смещение для пагинации
by: Параметр сортировки (new/active)
Returns:
list: Список авторов с их статистикой
"""
return await get_authors_with_stats(limit, offset, by)
@query.field("get_author")
async def get_author(_, _info, slug="", author_id=0):
author_dict = None
@@ -317,6 +302,46 @@ async def load_authors_by(_, _info, by, limit, offset):
return await get_authors_with_stats(limit, offset, by)
@query.field("load_authors_search")
async def load_authors_search(_, info, text: str, limit: int = 10, offset: int = 0):
"""
Resolver for searching authors by text. Works with the txt-ai search endpoint.
Args:
text: Search text
limit: Maximum number of authors to return
offset: Offset for pagination
Returns:
list: List of authors matching the search criteria
"""
# Get author IDs from search engine (already sorted by relevance)
search_results = await search_service.search_authors(text, limit, offset)
if not search_results:
return []
author_ids = [result.get("id") for result in search_results if result.get("id")]
if not author_ids:
return []
# Fetch full author objects from DB
with local_session() as session:
# Simple query to get authors by IDs - no need for stats here
authors_query = select(Author).filter(Author.id.in_(author_ids))
db_authors = session.execute(authors_query).scalars().all()
if not db_authors:
return []
# Create a dictionary for quick lookup
authors_dict = {str(author.id): author for author in db_authors}
# Keep the order from search results (maintains the relevance sorting)
ordered_authors = [authors_dict[author_id] for author_id in author_ids if author_id in authors_dict]
return ordered_authors
def get_author_id_from(slug="", user=None, author_id=None):
try:
author_id = None

View File

@@ -1,6 +1,7 @@
import time
from operator import or_
import trafilatura
from sqlalchemy.sql import and_
from cache.cache import (
@@ -30,7 +31,6 @@ def create_shout_from_draft(session, draft, author_id):
cover=draft.cover,
cover_caption=draft.cover_caption,
lead=draft.lead,
description=draft.description,
title=draft.title,
subtitle=draft.subtitle,
layout=draft.layout,
@@ -105,12 +105,17 @@ async def create_draft(_, info, draft_input):
if "title" not in draft_input or not draft_input["title"]:
draft_input["title"] = "" # Пустая строка вместо NULL
# Проверяем slug - он должен быть или не пустым, или не передаваться вообще
if "slug" in draft_input and (draft_input["slug"] is None or draft_input["slug"] == ""):
# При создании черновика удаляем пустой slug из входных данных
del draft_input["slug"]
try:
with local_session() as session:
# Remove id from input if present since it's auto-generated
if "id" in draft_input:
del draft_input["id"]
# Добавляем текущее время создания
draft_input["created_at"] = int(time.time())
@@ -122,6 +127,11 @@ async def create_draft(_, info, draft_input):
logger.error(f"Failed to create draft: {e}", exc_info=True)
return {"error": f"Failed to create draft: {str(e)}"}
def generate_teaser(body, limit=300):
body_text = trafilatura.extract(body, include_comments=False, include_tables=False)
body_teaser = ". ".join(body_text[:limit].split(". ")[:-1])
return body_teaser
@mutation.field("update_draft")
@login_required
@@ -142,14 +152,65 @@ async def update_draft(_, info, draft_id: int, draft_input):
if not user_id or not author_id:
return {"error": "Author ID are required"}
# Проверяем slug - он должен быть или не пустым, или не передаваться вообще
if "slug" in draft_input and (draft_input["slug"] is None or draft_input["slug"] == ""):
# Если slug пустой, либо удаляем его из входных данных, либо генерируем временный уникальный
# Вариант 1: просто удаляем ключ из входных данных, чтобы оставить старое значение
del draft_input["slug"]
# Вариант 2 (если нужно обновить): генерируем временный уникальный slug
# import uuid
# draft_input["slug"] = f"draft-{uuid.uuid4().hex[:8]}"
with local_session() as session:
draft = session.query(Draft).filter(Draft.id == draft_id).first()
if not draft:
return {"error": "Draft not found"}
# Generate SEO description if not provided and not already set
if "seo" not in draft_input and not draft.seo:
body_src = draft_input.get("body") if "body" in draft_input else draft.body
lead_src = draft_input.get("lead") if "lead" in draft_input else draft.lead
body_text = None
if body_src:
try:
# Extract text, excluding comments and tables
body_text = trafilatura.extract(body_src, include_comments=False, include_tables=False)
except Exception as e:
logger.warning(f"Trafilatura failed to extract body text for draft {draft_id}: {e}")
lead_text = None
if lead_src:
try:
# Extract text from lead
lead_text = trafilatura.extract(lead_src, include_comments=False, include_tables=False)
except Exception as e:
logger.warning(f"Trafilatura failed to extract lead text for draft {draft_id}: {e}")
# Generate body teaser only if body_text was successfully extracted
body_teaser = generate_teaser(body_text, 300) if body_text else ""
# Prioritize lead_text for SEO, fallback to body_teaser. Ensure it's a string.
generated_seo = lead_text if lead_text else body_teaser
draft_input["seo"] = generated_seo if generated_seo else ""
# Update the draft object with new data from draft_input
# Assuming Draft.update is a helper that iterates keys or similar.
# A more standard SQLAlchemy approach would be:
# for key, value in draft_input.items():
# if hasattr(draft, key):
# setattr(draft, key, value)
# But we stick to the existing pattern for now.
Draft.update(draft, draft_input)
draft.updated_at = int(time.time())
# Set updated timestamp and author
current_time = int(time.time())
draft.updated_at = current_time
draft.updated_by = author_id # Assuming author_id is correctly fetched context
session.commit()
# Invalidate cache related to this draft if necessary (consider adding)
# await invalidate_draft_cache(draft_id)
return {"draft": draft}
@@ -249,7 +310,6 @@ async def publish_shout(_, info, shout_id: int):
shout.cover = draft.cover
shout.cover_caption = draft.cover_caption
shout.lead = draft.lead
shout.description = draft.description
shout.layout = draft.layout
shout.media = draft.media
shout.lang = draft.lang

View File

@@ -1,6 +1,7 @@
import time
import orjson
import trafilatura
from sqlalchemy import and_, desc, select
from sqlalchemy.orm import joinedload
from sqlalchemy.sql.functions import coalesce
@@ -176,9 +177,16 @@ async def create_shout(_, info, inp):
logger.info(f"Creating shout with input: {inp}")
# Создаем публикацию без topics
body = inp.get("body", "")
lead = inp.get("lead", "")
body_text = trafilatura.extract(body)
lead_text = trafilatura.extract(lead)
seo = inp.get("seo", lead_text or ". ".join(body_text[:300].split(". ")[:-1]))
new_shout = Shout(
slug=slug,
body=inp.get("body", ""),
body=body,
seo=seo,
lead=lead,
layout=inp.get("layout", "article"),
title=inp.get("title", ""),
created_by=author_id,
@@ -380,7 +388,7 @@ def patch_topics(session, shout, topics_input):
# @login_required
async def update_shout(_, info, shout_id: int, shout_input=None, publish=False):
logger.info(f"Starting update_shout with id={shout_id}, publish={publish}")
logger.debug(f"Full shout_input: {shout_input}")
logger.debug(f"Full shout_input: {shout_input}") # DraftInput
user_id = info.context.get("user_id")
roles = info.context.get("roles", [])

View File

@@ -67,30 +67,35 @@ def add_reaction_stat_columns(q):
return q
def get_reactions_with_stat(q, limit, offset):
def get_reactions_with_stat(q, limit=10, offset=0):
"""
Execute the reaction query and retrieve reactions with statistics.
:param q: Query with reactions and statistics.
:param limit: Number of reactions to load.
:param offset: Pagination offset.
:return: List of reactions.
:return: List of reactions as dictionaries.
>>> get_reactions_with_stat(q, 10, 0) # doctest: +SKIP
[{'id': 1, 'body': 'Текст комментария', 'stat': {'rating': 5, 'comments_count': 3}, ...}]
"""
q = q.limit(limit).offset(offset)
reactions = []
with local_session() as session:
result_rows = session.execute(q)
for reaction, author, shout, commented_stat, rating_stat in result_rows:
for reaction, author, shout, comments_count, rating_stat in result_rows:
# Пропускаем реакции с отсутствующими shout или author
if not shout or not author:
logger.error(f"Пропущена реакция из-за отсутствия shout или author: {reaction.dict()}")
continue
reaction.created_by = author.dict()
reaction.shout = shout.dict()
reaction.stat = {"rating": rating_stat, "comments": commented_stat}
reactions.append(reaction)
# Преобразуем Reaction в словарь для доступа по ключу
reaction_dict = reaction.dict()
reaction_dict["created_by"] = author.dict()
reaction_dict["shout"] = shout.dict()
reaction_dict["stat"] = {"rating": rating_stat, "comments_count": comments_count}
reactions.append(reaction_dict)
return reactions
@@ -393,7 +398,7 @@ async def update_reaction(_, info, reaction):
result = session.execute(reaction_query).unique().first()
if result:
r, author, _shout, commented_stat, rating_stat = result
r, author, _shout, comments_count, rating_stat = result
if not r or not author:
return {"error": "Invalid reaction ID or unauthorized"}
@@ -408,7 +413,7 @@ async def update_reaction(_, info, reaction):
session.commit()
r.stat = {
"commented": commented_stat,
"comments_count": comments_count,
"rating": rating_stat,
}
@@ -612,24 +617,22 @@ async def load_shout_comments(_, info, shout: int, limit=50, offset=0):
@query.field("load_comment_ratings")
async def load_comment_ratings(_, info, comment: int, limit=50, offset=0):
"""
Load ratings for a specified comment with pagination and statistics.
Load ratings for a specified comment with pagination.
:param info: GraphQL context info.
:param comment: Comment ID.
:param limit: Number of ratings to load.
:param offset: Pagination offset.
:return: List of reactions.
:return: List of ratings.
"""
q = query_reactions()
q = add_reaction_stat_columns(q)
# Filter, group, sort, limit, offset
q = q.filter(
and_(
Reaction.deleted_at.is_(None),
Reaction.reply_to == comment,
Reaction.kind == ReactionKind.COMMENT.value,
Reaction.kind.in_(RATING_REACTIONS),
)
)
q = q.group_by(Reaction.id, Author.id, Shout.id)
@@ -637,3 +640,187 @@ async def load_comment_ratings(_, info, comment: int, limit=50, offset=0):
# Retrieve and return reactions
return get_reactions_with_stat(q, limit, offset)
@query.field("load_comments_branch")
async def load_comments_branch(
_,
_info,
shout: int,
parent_id: int | None = None,
limit=10,
offset=0,
sort="newest",
children_limit=3,
children_offset=0,
):
"""
Загружает иерархические комментарии с возможностью пагинации корневых и дочерних.
:param info: GraphQL context info.
:param shout: ID статьи.
:param parent_id: ID родительского комментария (None для корневых).
:param limit: Количество комментариев для загрузки.
:param offset: Смещение для пагинации.
:param sort: Порядок сортировки ('newest', 'oldest', 'like').
:param children_limit: Максимальное количество дочерних комментариев.
:param children_offset: Смещение для дочерних комментариев.
:return: Список комментариев с дочерними.
"""
# Создаем базовый запрос
q = query_reactions()
q = add_reaction_stat_columns(q)
# Фильтруем по статье и типу (комментарии)
q = q.filter(
and_(
Reaction.deleted_at.is_(None),
Reaction.shout == shout,
Reaction.kind == ReactionKind.COMMENT.value,
)
)
# Фильтруем по родительскому ID
if parent_id is None:
# Загружаем только корневые комментарии
q = q.filter(Reaction.reply_to.is_(None))
else:
# Загружаем только прямые ответы на указанный комментарий
q = q.filter(Reaction.reply_to == parent_id)
# Сортировка и группировка
q = q.group_by(Reaction.id, Author.id, Shout.id)
# Определяем сортировку
order_by_stmt = None
if sort.lower() == "oldest":
order_by_stmt = asc(Reaction.created_at)
elif sort.lower() == "like":
order_by_stmt = desc("rating_stat")
else: # "newest" по умолчанию
order_by_stmt = desc(Reaction.created_at)
q = q.order_by(order_by_stmt)
# Выполняем запрос для получения комментариев
comments = get_reactions_with_stat(q, limit, offset)
# Если комментарии найдены, загружаем дочерние и количество ответов
if comments:
# Загружаем количество ответов для каждого комментария
await load_replies_count(comments)
# Загружаем дочерние комментарии
await load_first_replies(comments, children_limit, children_offset, sort)
return comments
async def load_replies_count(comments):
"""
Загружает количество ответов для списка комментариев и обновляет поле stat.comments_count.
:param comments: Список комментариев, для которых нужно загрузить количество ответов.
"""
if not comments:
return
comment_ids = [comment["id"] for comment in comments]
# Запрос для подсчета количества ответов
q = (
select(Reaction.reply_to.label("parent_id"), func.count().label("count"))
.where(
and_(
Reaction.reply_to.in_(comment_ids),
Reaction.deleted_at.is_(None),
Reaction.kind == ReactionKind.COMMENT.value,
)
)
.group_by(Reaction.reply_to)
)
# Выполняем запрос
with local_session() as session:
result = session.execute(q).fetchall()
# Создаем словарь {parent_id: count}
replies_count = {row[0]: row[1] for row in result}
# Добавляем значения в комментарии
for comment in comments:
if "stat" not in comment:
comment["stat"] = {}
# Обновляем счетчик комментариев в stat
comment["stat"]["comments_count"] = replies_count.get(comment["id"], 0)
async def load_first_replies(comments, limit, offset, sort="newest"):
"""
Загружает первые N ответов для каждого комментария.
:param comments: Список комментариев, для которых нужно загрузить ответы.
:param limit: Максимальное количество ответов для каждого комментария.
:param offset: Смещение для пагинации дочерних комментариев.
:param sort: Порядок сортировки ответов.
"""
if not comments or limit <= 0:
return
# Собираем ID комментариев
comment_ids = [comment["id"] for comment in comments]
# Базовый запрос для загрузки ответов
q = query_reactions()
q = add_reaction_stat_columns(q)
# Фильтрация: только ответы на указанные комментарии
q = q.filter(
and_(
Reaction.reply_to.in_(comment_ids),
Reaction.deleted_at.is_(None),
Reaction.kind == ReactionKind.COMMENT.value,
)
)
# Группировка
q = q.group_by(Reaction.id, Author.id, Shout.id)
# Определяем сортировку
order_by_stmt = None
if sort.lower() == "oldest":
order_by_stmt = asc(Reaction.created_at)
elif sort.lower() == "like":
order_by_stmt = desc("rating_stat")
else: # "newest" по умолчанию
order_by_stmt = desc(Reaction.created_at)
q = q.order_by(order_by_stmt, Reaction.reply_to)
# Выполняем запрос - указываем limit для неограниченного количества ответов
# но не более 100 на родительский комментарий
replies = get_reactions_with_stat(q, limit=100, offset=0)
# Группируем ответы по родительским ID
replies_by_parent = {}
for reply in replies:
parent_id = reply.get("reply_to")
if parent_id not in replies_by_parent:
replies_by_parent[parent_id] = []
replies_by_parent[parent_id].append(reply)
# Добавляем ответы к соответствующим комментариям с учетом смещения и лимита
for comment in comments:
comment_id = comment["id"]
if comment_id in replies_by_parent:
parent_replies = replies_by_parent[comment_id]
# Применяем смещение и лимит
comment["first_replies"] = parent_replies[offset : offset + limit]
else:
comment["first_replies"] = []
# Загружаем количество ответов для дочерних комментариев
all_replies = [reply for replies in replies_by_parent.values() for reply in replies]
if all_replies:
await load_replies_count(all_replies)

View File

@@ -10,7 +10,7 @@ from orm.shout import Shout, ShoutAuthor, ShoutTopic
from orm.topic import Topic
from services.db import json_array_builder, json_builder, local_session
from services.schema import query
from services.search import search_text
from services.search import search_text, get_search_count
from services.viewed import ViewedStorage
from utils.logger import root_logger as logger
@@ -187,12 +187,10 @@ def get_shouts_with_links(info, q, limit=20, offset=0):
"""
shouts = []
try:
# logger.info(f"Starting get_shouts_with_links with limit={limit}, offset={offset}")
q = q.limit(limit).offset(offset)
with local_session() as session:
shouts_result = session.execute(q).all()
# logger.info(f"Got {len(shouts_result) if shouts_result else 0} shouts from query")
if not shouts_result:
logger.warning("No shouts found in query result")
@@ -203,7 +201,6 @@ def get_shouts_with_links(info, q, limit=20, offset=0):
shout = None
if hasattr(row, "Shout"):
shout = row.Shout
# logger.debug(f"Processing shout#{shout.id} at index {idx}")
if shout:
shout_id = int(f"{shout.id}")
shout_dict = shout.dict()
@@ -225,26 +222,22 @@ def get_shouts_with_links(info, q, limit=20, offset=0):
elif isinstance(row.stat, dict):
stat = row.stat
viewed = ViewedStorage.get_shout(shout_id=shout_id) or 0
shout_dict["stat"] = {**stat, "viewed": viewed, "commented": stat.get("comments_count", 0)}
shout_dict["stat"] = {**stat, "viewed": viewed}
# Обработка main_topic и topics
topics = None
if has_field(info, "topics") and hasattr(row, "topics"):
topics = orjson.loads(row.topics) if isinstance(row.topics, str) else row.topics
# logger.debug(f"Shout#{shout_id} topics: {topics}")
shout_dict["topics"] = topics
if has_field(info, "main_topic"):
main_topic = None
if hasattr(row, "main_topic"):
# logger.debug(f"Raw main_topic for shout#{shout_id}: {row.main_topic}")
main_topic = (
orjson.loads(row.main_topic) if isinstance(row.main_topic, str) else row.main_topic
)
# logger.debug(f"Parsed main_topic for shout#{shout_id}: {main_topic}")
if not main_topic and topics and len(topics) > 0:
# logger.info(f"No main_topic found for shout#{shout_id}, using first topic from list")
main_topic = {
"id": topics[0]["id"],
"title": topics[0]["title"],
@@ -252,10 +245,8 @@ def get_shouts_with_links(info, q, limit=20, offset=0):
"is_main": True,
}
elif not main_topic:
logger.warning(f"No main_topic and no topics found for shout#{shout_id}")
main_topic = {"id": 0, "title": "no topic", "slug": "notopic", "is_main": True}
shout_dict["main_topic"] = main_topic
# logger.debug(f"Final main_topic for shout#{shout_id}: {main_topic}")
if has_field(info, "authors") and hasattr(row, "authors"):
shout_dict["authors"] = (
@@ -282,7 +273,6 @@ def get_shouts_with_links(info, q, limit=20, offset=0):
logger.error(f"Fatal error in get_shouts_with_links: {e}", exc_info=True)
raise
finally:
logger.info(f"Returning {len(shouts)} shouts from get_shouts_with_links")
return shouts
@@ -401,33 +391,49 @@ async def load_shouts_search(_, info, text, options):
"""
limit = options.get("limit", 10)
offset = options.get("offset", 0)
if isinstance(text, str) and len(text) > 2:
# Get search results with pagination
results = await search_text(text, limit, offset)
scores = {}
hits_ids = []
for sr in results:
shout_id = sr.get("id")
if shout_id:
shout_id = str(shout_id)
scores[shout_id] = sr.get("score")
hits_ids.append(shout_id)
if not results:
logger.info(f"No search results found for '{text}'")
return []
# Extract IDs in the order from the search engine
hits_ids = [str(sr.get("id")) for sr in results if sr.get("id")]
q = (
query_with_stat(info)
if has_field(info, "stat")
else select(Shout).filter(and_(Shout.published_at.is_not(None), Shout.deleted_at.is_(None)))
)
# Query DB for only the IDs in the current page
q = query_with_stat(info)
q = q.filter(Shout.id.in_(hits_ids))
q = apply_filters(q, options)
q = apply_sorting(q, options)
shouts = get_shouts_with_links(info, q, limit, offset)
for shout in shouts:
shout.score = scores[f"{shout.id}"]
shouts.sort(key=lambda x: x.score, reverse=True)
return shouts
q = apply_filters(q, options.get("filters", {}))
shouts = get_shouts_with_links(info, q, len(hits_ids), 0)
# Reorder shouts to match the order from hits_ids
shouts_dict = {str(shout['id']): shout for shout in shouts}
ordered_shouts = [shouts_dict[shout_id] for shout_id in hits_ids if shout_id in shouts_dict]
return ordered_shouts
return []
@query.field("get_search_results_count")
async def get_search_results_count(_, info, text):
"""
Returns the total count of search results for a search query.
:param _: Root query object (unused)
:param info: GraphQL context information
:param text: Search query text
:return: Total count of results
"""
if isinstance(text, str) and len(text) > 2:
count = await get_search_count(text)
return {"count": count}
return {"count": 0}
@query.field("load_shouts_unrated")
async def load_shouts_unrated(_, info, options):
"""

View File

@@ -127,6 +127,28 @@ async def get_topics_with_stats(limit=100, offset=0, community_id=None, by=None)
"""
followers_stats = {row[0]: row[1] for row in session.execute(text(followers_stats_query))}
# Запрос на получение статистики авторов для выбранных тем
authors_stats_query = f"""
SELECT st.topic, COUNT(DISTINCT sa.author) as authors_count
FROM shout_topic st
JOIN shout s ON st.shout = s.id AND s.deleted_at IS NULL AND s.published_at IS NOT NULL
JOIN shout_author sa ON sa.shout = s.id
WHERE st.topic IN ({",".join(map(str, topic_ids))})
GROUP BY st.topic
"""
authors_stats = {row[0]: row[1] for row in session.execute(text(authors_stats_query))}
# Запрос на получение статистики комментариев для выбранных тем
comments_stats_query = f"""
SELECT st.topic, COUNT(DISTINCT r.id) as comments_count
FROM shout_topic st
JOIN shout s ON st.shout = s.id AND s.deleted_at IS NULL AND s.published_at IS NOT NULL
JOIN reaction r ON r.shout = s.id
WHERE st.topic IN ({",".join(map(str, topic_ids))})
GROUP BY st.topic
"""
comments_stats = {row[0]: row[1] for row in session.execute(text(comments_stats_query))}
# Формируем результат с добавлением статистики
result = []
for topic in topics:
@@ -134,6 +156,8 @@ async def get_topics_with_stats(limit=100, offset=0, community_id=None, by=None)
topic_dict["stat"] = {
"shouts": shouts_stats.get(topic.id, 0),
"followers": followers_stats.get(topic.id, 0),
"authors": authors_stats.get(topic.id, 0),
"comments": comments_stats.get(topic.id, 0),
}
result.append(topic_dict)
@@ -202,23 +226,6 @@ async def get_topics_all(_, _info):
return await get_all_topics()
# Запрос на получение тем с пагинацией и статистикой
@query.field("get_topics_paginated")
async def get_topics_paginated(_, _info, limit=100, offset=0, by=None):
"""
Получает список тем с пагинацией и статистикой.
Args:
limit: Максимальное количество возвращаемых тем
offset: Смещение для пагинации
by: Опциональные параметры сортировки
Returns:
list: Список тем с их статистикой
"""
return await get_topics_with_stats(limit, offset, None, by)
# Запрос на получение тем по сообществу
@query.field("get_topics_by_community")
async def get_topics_by_community(_, _info, community_id: int, limit=100, offset=0, by=None):

View File

@@ -33,7 +33,6 @@ input DraftInput {
main_topic_id: Int # Changed from main_topic: Topic
media: [MediaItemInput] # Changed to use MediaItemInput
lead: String
description: String
subtitle: String
lang: String
seo: String

View File

@@ -4,7 +4,7 @@ type Query {
get_author_id(user: String!): Author
get_authors_all: [Author]
load_authors_by(by: AuthorsBy!, limit: Int, offset: Int): [Author]
# search_authors(what: String!): [Author]
load_authors_search(text: String!, limit: Int, offset: Int): [Author!] # Search for authors by name or bio
# community
get_community: Community
@@ -26,10 +26,14 @@ type Query {
load_shout_ratings(shout: Int!, limit: Int, offset: Int): [Reaction]
load_comment_ratings(comment: Int!, limit: Int, offset: Int): [Reaction]
# branched comments pagination
load_comments_branch(shout: Int!, parent_id: Int, limit: Int, offset: Int, sort: ReactionSort, children_limit: Int, children_offset: Int): [Reaction]
# reader
get_shout(slug: String, shout_id: Int): Shout
load_shouts_by(options: LoadShoutsOptions): [Shout]
load_shouts_search(text: String!, options: LoadShoutsOptions): [SearchResult]
get_search_results_count(text: String!): CountResult!
load_shouts_bookmarked(options: LoadShoutsOptions): [Shout]
# rating
@@ -57,7 +61,7 @@ type Query {
get_topic(slug: String!): Topic
get_topics_all: [Topic]
get_topics_by_author(slug: String, user: String, author_id: Int): [Topic]
get_topics_by_community(slug: String, community_id: Int): [Topic]
get_topics_by_community(community_id: Int!, limit: Int, offset: Int): [Topic]
# notifier
load_notifications(after: Int!, limit: Int, offset: Int): NotificationsResult!

View File

@@ -55,6 +55,7 @@ type Reaction {
stat: Stat
oid: String
# old_thread: String
first_replies: [Reaction]
}
type MediaItem {
@@ -79,7 +80,6 @@ type Shout {
layout: String!
lead: String
description: String
subtitle: String
lang: String
cover: String
@@ -99,6 +99,7 @@ type Shout {
featured_at: Int
deleted_at: Int
seo: String # generated if not set
version_of: Shout # TODO: use version_of somewhere
draft: Draft
media: [MediaItem]
@@ -110,13 +111,12 @@ type Draft {
id: Int!
created_at: Int!
created_by: Author!
community: Community!
layout: String
slug: String
title: String
subtitle: String
lead: String
description: String
body: String
media: [MediaItem]
cover: String
@@ -136,7 +136,7 @@ type Draft {
type Stat {
rating: Int
commented: Int
comments_count: Int
viewed: Int
last_commented_at: Int
}
@@ -207,6 +207,7 @@ type CommonResult {
}
type SearchResult {
id: Int!
slug: String!
title: String!
cover: String
@@ -274,3 +275,7 @@ type MyRateComment {
my_rate: ReactionKind
}
type CountResult {
count: Int!
}

View File

@@ -19,7 +19,7 @@ from sqlalchemy import (
inspect,
text,
)
from sqlalchemy.orm import Session, configure_mappers, declarative_base
from sqlalchemy.orm import Session, configure_mappers, declarative_base, joinedload
from sqlalchemy.sql.schema import Table
from settings import DB_URL
@@ -259,3 +259,32 @@ def get_json_builder():
# Используем их в коде
json_builder, json_array_builder, json_cast = get_json_builder()
# Fetch all shouts, with authors preloaded
# This function is used for search indexing
async def fetch_all_shouts(session=None):
"""Fetch all published shouts for search indexing with authors preloaded"""
from orm.shout import Shout
close_session = False
if session is None:
session = local_session()
close_session = True
try:
# Fetch only published and non-deleted shouts with authors preloaded
query = session.query(Shout).options(
joinedload(Shout.authors)
).filter(
Shout.published_at.is_not(None),
Shout.deleted_at.is_(None)
)
shouts = query.all()
return shouts
except Exception as e:
logger.error(f"Error fetching shouts for search indexing: {e}")
return []
finally:
if close_session:
session.close()

File diff suppressed because it is too large
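The suppressed file is most likely `services/search.py`. Based on how the resolvers above call it (`search_text(text, limit, offset)`, `get_search_count(text)`, `search_service.search_authors(...)`, `initialize_search_index(...)`), here is a hedged sketch of the cached, paginated surface it would expose; only the signatures are grounded in this diff, the bodies are illustrative assumptions:

```python
# Hedged sketch, not the repository implementation: the resolvers in this diff
# import search_text and get_search_count from services.search, whose diff is
# suppressed above. Key layout and caching details below are assumed.
import orjson

from services.redis import redis  # project Redis wrapper


async def _get_cached_results(text: str):
    raw = await redis.execute("GET", f"search:{text}")  # assumed key layout
    return orjson.loads(raw) if raw else None


async def search_text(text: str, limit: int = 10, offset: int = 0) -> list[dict]:
    """Return one page of search hits; the full result set is cached once so
    later offsets are served without re-querying the search engine."""
    results = await _get_cached_results(text)
    if results is None:
        results = []  # the real code would query the txtai service here and cache the hits
    return results[offset : offset + limit]


async def get_search_count(text: str) -> int:
    """Total number of hits for a query, used by get_search_results_count."""
    results = await _get_cached_results(text)
    return len(results) if results else 0
```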

View File

@@ -2,9 +2,7 @@ import asyncio
import os
import time
from datetime import datetime, timedelta, timezone
from typing import Dict
import orjson
from typing import Dict, Optional
# ga
from google.analytics.data_v1beta import BetaAnalyticsDataClient
@@ -20,33 +18,39 @@ from orm.author import Author
from orm.shout import Shout, ShoutAuthor, ShoutTopic
from orm.topic import Topic
from services.db import local_session
from services.redis import redis
from utils.logger import root_logger as logger
GOOGLE_KEYFILE_PATH = os.environ.get("GOOGLE_KEYFILE_PATH", "/dump/google-service.json")
GOOGLE_PROPERTY_ID = os.environ.get("GOOGLE_PROPERTY_ID", "")
VIEWS_FILEPATH = "/dump/views.json"
class ViewedStorage:
"""
Class for storing and accessing view data.
Uses Redis as the primary storage and Google Analytics to collect fresh data.
"""
lock = asyncio.Lock()
precounted_by_slug = {}
views_by_shout = {}
shouts_by_topic = {}
shouts_by_author = {}
views = None
period = 60 * 60 # every hour
analytics_client: BetaAnalyticsDataClient | None = None
analytics_client: Optional[BetaAnalyticsDataClient] = None
auth_result = None
running = False
redis_views_key = None
last_update_timestamp = 0
start_date = datetime.now().strftime("%Y-%m-%d")
@staticmethod
async def init():
"""Подключение к клиенту Google Analytics с использованием аутентификации"""
"""Подключение к клиенту Google Analytics и загрузка данных о просмотрах из Redis"""
self = ViewedStorage
async with self.lock:
# Load precounted views from the JSON file
self.load_precounted_views()
# Load precounted views from Redis
await self.load_views_from_redis()
os.environ.setdefault("GOOGLE_APPLICATION_CREDENTIALS", GOOGLE_KEYFILE_PATH)
if GOOGLE_KEYFILE_PATH and os.path.isfile(GOOGLE_KEYFILE_PATH):
@@ -62,40 +66,54 @@ class ViewedStorage:
self.running = False
@staticmethod
def load_precounted_views():
"""Загрузка предварительно подсчитанных просмотров из файла JSON"""
async def load_views_from_redis():
"""Загрузка предварительно подсчитанных просмотров из Redis"""
self = ViewedStorage
viewfile_path = VIEWS_FILEPATH
if not os.path.exists(viewfile_path):
viewfile_path = os.path.join(os.path.curdir, "views.json")
if not os.path.exists(viewfile_path):
logger.warning(" * views.json not found")
return
logger.info(f" * loading views from {viewfile_path}")
try:
start_date_int = os.path.getmtime(viewfile_path)
start_date_str = datetime.fromtimestamp(start_date_int).strftime("%Y-%m-%d")
self.start_date = start_date_str
# Connect to Redis if the connection is not yet established
if not redis._client:
await redis.connect()
# Get all migrated_views_* keys and find the most recent one
keys = await redis.execute("KEYS", "migrated_views_*")
if not keys:
logger.warning(" * No migrated_views keys found in Redis")
return
# Keep only timestamp-style keys (exclude migrated_views_slugs)
timestamp_keys = [k for k in keys if k != "migrated_views_slugs"]
if not timestamp_keys:
logger.warning(" * No migrated_views timestamp keys found in Redis")
return
# Sort by creation time (encoded in the key name) and take the latest
timestamp_keys.sort()
latest_key = timestamp_keys[-1]
self.redis_views_key = latest_key
# Read the creation timestamp to set start_date
timestamp = await redis.execute("HGET", latest_key, "_timestamp")
if timestamp:
self.last_update_timestamp = int(timestamp)
timestamp_dt = datetime.fromtimestamp(int(timestamp))
self.start_date = timestamp_dt.strftime("%Y-%m-%d")
# If the data is from today, consider it up to date
now_date = datetime.now().strftime("%Y-%m-%d")
if now_date == self.start_date:
logger.info(" * views data is up to date!")
logger.info(" * Views data is up to date!")
else:
logger.warn(f" * {viewfile_path} is too old: {self.start_date}")
logger.warning(f" * Views data is from {self.start_date}, may need update")
with open(viewfile_path, "r") as file:
precounted_views = orjson.loads(file.read())
self.precounted_by_slug.update(precounted_views)
logger.info(f" * {len(precounted_views)} shouts with views was loaded.")
except Exception as e:
logger.error(f"precounted views loading error: {e}")
# Log how many entries were loaded
total_entries = await redis.execute("HGET", latest_key, "_total")
if total_entries:
logger.info(f" * {total_entries} shouts with views loaded from Redis key: {latest_key}")
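The Redis layout this loader expects can be pictured with a small seeding sketch. The key names and the service fields _timestamp/_total mirror the code above; the ids, slugs and counts are invented, and it is assumed the same redis.execute wrapper also supports HSET:

# Sketch of the migrated_views_* layout read by load_views_from_redis().
# Sample ids/slugs/counts are invented; only the key names mirror the code above.
import time
from services.redis import redis

async def seed_sample_views():
    if not redis._client:
        await redis.connect()
    ts = int(time.time())
    key = f"migrated_views_{ts}"
    # per-shout views keyed by shout id, plus service fields
    await redis.execute("HSET", key, "_timestamp", str(ts))
    await redis.execute("HSET", key, "_total", "2")
    await redis.execute("HSET", key, "123", "42")
    await redis.execute("HSET", key, "456", "7")
    # slug-based fallback hash
    await redis.execute("HSET", "migrated_views_slugs", "my-shout-slug", "42")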
# noinspection PyTypeChecker
@staticmethod
async def update_pages():
"""Запрос всех страниц от Google Analytics, отсортрованных по количеству просмотров"""
"""Запрос всех страниц от Google Analytics, отсортированных по количеству просмотров"""
self = ViewedStorage
logger.info(" ⎧ views update from Google Analytics ---")
if self.running:
@@ -140,15 +158,40 @@ class ViewedStorage:
self.running = False
@staticmethod
def get_shout(shout_slug="", shout_id=0) -> int:
"""Получение метрики просмотров shout по slug или id."""
async def get_shout(shout_slug="", shout_id=0) -> int:
"""
Get the view count of a shout by slug or id.
Args:
shout_slug: slug of the publication
shout_id: id of the publication
Returns:
int: number of views
"""
self = ViewedStorage
# Read the data from Redis under the new storage scheme
if not redis._client:
await redis.connect()
fresh_views = self.views_by_shout.get(shout_slug, 0)
precounted_views = self.precounted_by_slug.get(shout_slug, 0)
return fresh_views + precounted_views
# If an id is given, try to read from Redis under the migrated_views_<timestamp> key
if shout_id and self.redis_views_key:
precounted_views = await redis.execute("HGET", self.redis_views_key, str(shout_id))
if precounted_views:
return fresh_views + int(precounted_views)
# If there is no id or no data, fall back to the slug lookup in a separate hash
precounted_views = await redis.execute("HGET", "migrated_views_slugs", shout_slug)
if precounted_views:
return fresh_views + int(precounted_views)
return fresh_views
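Because get_shout is now a coroutine, callers must await it. A minimal resolver-style sketch, assuming the module path services.viewed and a shout object with slug/id attributes:

# Sketch: awaiting the now-asynchronous per-shout view counter.
from services.viewed import ViewedStorage  # assumed module path

async def resolve_shout_views(shout) -> int:
    # combines fresh GA counts with the precounted values stored in Redis
    return await ViewedStorage.get_shout(shout_slug=shout.slug, shout_id=shout.id)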
@staticmethod
def get_shout_media(shout_slug) -> Dict[str, int]:
async def get_shout_media(shout_slug) -> Dict[str, int]:
"""Получение метрики воспроизведения shout по slug."""
self = ViewedStorage
@@ -157,23 +200,29 @@ class ViewedStorage:
return self.views_by_shout.get(shout_slug, 0)
@staticmethod
def get_topic(topic_slug) -> int:
async def get_topic(topic_slug) -> int:
"""Получение суммарного значения просмотров темы."""
self = ViewedStorage
return sum(self.views_by_shout.get(shout_slug, 0) for shout_slug in self.shouts_by_topic.get(topic_slug, []))
views_count = 0
for shout_slug in self.shouts_by_topic.get(topic_slug, []):
views_count += await self.get_shout(shout_slug=shout_slug)
return views_count
@staticmethod
def get_author(author_slug) -> int:
async def get_author(author_slug) -> int:
"""Получение суммарного значения просмотров автора."""
self = ViewedStorage
return sum(self.views_by_shout.get(shout_slug, 0) for shout_slug in self.shouts_by_author.get(author_slug, []))
views_count = 0
for shout_slug in self.shouts_by_author.get(author_slug, []):
views_count += await self.get_shout(shout_slug=shout_slug)
return views_count
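The topic and author aggregates now await get_shout per slug, so they are coroutines as well; a short usage sketch with invented slugs:

# Sketch: the aggregate counters are awaited too; per-shout lookups run sequentially.
from services.viewed import ViewedStorage  # assumed module path

async def print_view_totals():
    topic_total = await ViewedStorage.get_topic("culture")        # sample slug
    author_total = await ViewedStorage.get_author("author-slug")  # sample slug
    print(topic_total, author_total)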
@staticmethod
def update_topics(shout_slug):
"""Обновление счетчиков темы по slug shout"""
self = ViewedStorage
with local_session() as session:
# Helper function defined to avoid code duplication
def update_groups(dictionary, key, value):
dictionary[key] = list(set(dictionary.get(key, []) + [value]))