2
0
Commit Graph

28 Commits

Author SHA1 Message Date
wxiaoguang
230523a45f Fix bleve fuzziness search (#33078)
Close #31565
2025-01-03 00:32:02 +08:00
cassio zareck
bd1488aade Fix settings not being loaded at CLI (#26402)
Closes #25898
The problem was that the default settings weren't being loaded

---------

Signed-off-by: cassiozareck <cassiomilczareck@gmail.com>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: wxiaoguang <wxiaoguang@gmail.com>
2024-12-30 05:54:20 +00:00
wxiaoguang
12248c8076 Improve grep search (#30843)
Reduce the context line number to 1, make "git grep" search respect the
include/exclude patter, and fix #30785
2024-05-03 09:13:48 +00:00
wxiaoguang
20226c5859 Refactor startup deprecation messages (#30305)
It doesn't change logic, it only does:

1. Rename the variable and function names
2. Use more consistent format when mentioning config section&key
3. Improve some messages
2024-04-07 01:11:25 +00:00
wxiaoguang
ca17aecee4 Do not allow different storage configurations to point to the same directory (#30169)
Replace #29171
2024-03-31 03:03:24 +00:00
Lunny Xiao
62ab887616 Disallow duplicate storage paths (#26484)
Replace #26380
2024-02-09 14:06:03 +00:00
techknowlogick
00efb81a8d Allow skipping forks and mirrors from being indexed (#23187)
This PR adds two new options to disable repo/code search indexing of
both forks and mirrors.

Related: #22842
2023-05-25 16:13:47 +08:00
wxiaoguang
2b179c4f05 Rewrite queue (#24505)
# ⚠️ Breaking

Many deprecated queue config options are removed (actually, they should
have been removed in 1.18/1.19).

If you see the fatal message when starting Gitea: "Please update your
app.ini to remove deprecated config options", please follow the error
messages to remove these options from your app.ini.

Example:

```
2023/05/06 19:39:22 [E] Removed queue option: `[indexer].ISSUE_INDEXER_QUEUE_TYPE`. Use new options in `[queue.issue_indexer]`
2023/05/06 19:39:22 [E] Removed queue option: `[indexer].UPDATE_BUFFER_LEN`. Use new options in `[queue.issue_indexer]`
2023/05/06 19:39:22 [F] Please update your app.ini to remove deprecated config options
```

Many options in `[queue]` are are dropped, including:
`WRAP_IF_NECESSARY`, `MAX_ATTEMPTS`, `TIMEOUT`, `WORKERS`,
`BLOCK_TIMEOUT`, `BOOST_TIMEOUT`, `BOOST_WORKERS`, they can be removed
from app.ini.

# The problem

The old queue package has some legacy problems:

* complexity: I doubt few people could tell how it works.
* maintainability: Too many channels and mutex/cond are mixed together,
too many different structs/interfaces depends each other.
* stability: due to the complexity & maintainability, sometimes there
are strange bugs and difficult to debug, and some code doesn't have test
(indeed some code is difficult to test because a lot of things are mixed
together).
* general applicability: although it is called "queue", its behavior is
not a well-known queue.
* scalability: it doesn't seem easy to make it work with a cluster
without breaking its behaviors.

It came from some very old code to "avoid breaking", however, its
technical debt is too heavy now. It's a good time to introduce a better
"queue" package.

# The new queue package

It keeps using old config and concept as much as possible.

* It only contains two major kinds of concepts:
    * The "base queue": channel, levelqueue, redis
* They have the same abstraction, the same interface, and they are
tested by the same testing code.
* The "WokerPoolQueue", it uses the "base queue" to provide "worker
pool" function, calls the "handler" to process the data in the base
queue.
* The new code doesn't do "PushBack"
* Think about a queue with many workers, the "PushBack" can't guarantee
the order for re-queued unhandled items, so in new code it just does
"normal push"
* The new code doesn't do "pause/resume"
* The "pause/resume" was designed to handle some handler's failure: eg:
document indexer (elasticsearch) is down
* If a queue is paused for long time, either the producers blocks or the
new items are dropped.
* The new code doesn't do such "pause/resume" trick, it's not a common
queue's behavior and it doesn't help much.
* If there are unhandled items, the "push" function just blocks for a
few seconds and then re-queue them and retry.
* The new code doesn't do "worker booster"
* Gitea's queue's handlers are light functions, the cost is only the
go-routine, so it doesn't make sense to "boost" them.
* The new code only use "max worker number" to limit the concurrent
workers.
* The new "Push" never blocks forever
* Instead of creating more and more blocking goroutines, return an error
is more friendly to the server and to the end user.

There are more details in code comments: eg: the "Flush" problem, the
strange "code.index" hanging problem, the "immediate" queue problem.

Almost ready for review.

TODO:

* [x] add some necessary comments during review
* [x] add some more tests if necessary
* [x] update documents and config options
* [x] test max worker / active worker
* [x] re-run the CI tasks to see whether any test is flaky
* [x] improve the `handleOldLengthConfiguration` to provide more
friendly messages
* [x] fine tune default config values (eg: length?)

## Code coverage:

![image](https://user-images.githubusercontent.com/2114189/236620635-55576955-f95d-4810-b12f-879026a3afdf.png)
2023-05-08 19:49:59 +08:00
techknowlogick
6583978561 Add meilisearch support (#23136)
Add meilisearch support

Fixes #20665
2023-03-28 22:23:23 -04:00
Lunny Xiao
67f0f6cb3f handle deprecated settings (#22992)
Fix #22736
2023-02-20 16:18:26 -06:00
Lunny Xiao
619c304370 Refactor the setting to make unit test easier (#22405)
Some bugs caused by less unit tests in fundamental packages. This PR
refactor `setting` package so that create a unit test will be easier
than before.

- All `LoadFromXXX` files has been splited as two functions, one is
`InitProviderFromXXX` and `LoadCommonSettings`. The first functions will
only include the code to create or new a ini file. The second function
will load common settings.
- It also renames all functions in setting from `newXXXService` to
`loadXXXSetting` or `loadXXXFrom` to make the function name less
confusing.
- Move `XORMLog` to `SQLLog` because it's a better name for that.

Maybe we should finally move these `loadXXXSetting` into the `XXXInit`
function? Any idea?

---------

Co-authored-by: 6543 <6543@obermui.de>
Co-authored-by: delvh <dev.lh@web.de>
2023-02-20 00:12:01 +08:00
flynnnnnnnnnn
487cb6a411 Implement FSFE REUSE for golang files (#21840)
Change all license headers to comply with REUSE specification.

Fix #16132

Co-authored-by: flynnnnnnnnnn <flynnnnnnnnnn@github>
Co-authored-by: John Olheiser <john.olheiser@gmail.com>
2022-11-27 18:20:29 +00:00
6543
e0a2bd79c9 format with gofumpt (#18184)
* gofumpt -w -l .

* gofumpt -w -l -extra .

* Add linter

* manual fix

* change make fmt
2022-01-20 18:46:10 +01:00
Gusted
15ad99eccf Enable deprecation error for v1.17.0 (#18341)
Co-authored-by: Andrew Thornton <art27@cantab.net>
2022-01-20 18:00:38 +01:00
luzpaz
34cc88bca1 Fix various documentation, user-facing, and source comment typos (#16367)
* Fix various doc, user-facing, and source comment typos

Found via `codespell -q 3 -S ./options/locale,./vendor -L ba,pullrequest,pullrequests,readby`
2021-07-08 13:38:13 +02:00
zeripath
cd5f72eb8e Clean-up the settings hierarchy for issue_indexer queue (#16001)
There are a couple of settings in `[indexer]` relating to the `issue_indexer` queue
which override settings in unpredictable ways. This PR adjusts this hierarchy and makes
explicit that these settings are deprecated.

Signed-off-by: Andrew Thornton <art27@cantab.net>

Co-authored-by: techknowlogick <techknowlogick@gitea.io>
2021-06-16 18:19:20 -04:00
zeripath
f67bd624ce Use filepath.ToSlash and Join in indexer defaults and queues (#15971)
As revealed by #15964 there is inconsistent use of filepath Join and path Join
for these directories. The best thing to do is to use filepath.Join but then ToSlash
them for consistency.

Signed-off-by: Andrew Thornton <art27@cantab.net>

Co-authored-by: John Olheiser <john.olheiser@gmail.com>
2021-05-25 22:50:35 -04:00
zeripath
8c7f6110a2 Change default queue settings to be low go-routines (#15964)
This PR suggests a change to the default configuration for queues:

* Use a common DATADIR for the queues
* Set starting workers to 0 and make boost a single worker

Signed-off-by: Andrew Thornton <art27@cantab.net>
2021-05-24 02:23:55 +03:00
zeripath
b28a013692 Avoid setting the CONN_STR in issue indexer queue unless it is meant to be set (#13069)
Since the move to common leveldb and common redis the disk queue code (#12385)
will check the connection string before defaulting to the DATADIR.

Therefore we should ensure that the connection string is kept empty
unless it is actually set.

Unforunately the issue indexer was missed in #13025 this PR fixes this omission

Fix #13062

Signed-off-by: Andrew Thornton <art27@cantab.net>
2020-10-07 23:24:41 +01:00
Lunny Xiao
e928107893 Support elastic search for code search (#10273)
* Support elastic search for code search

* Finished elastic search implementation and add some tests

* Enable test on drone and added docs

* Add new fields to elastic search

* Fix bug

* remove unused changes

* Use indexer alias to keep the gitea indexer version

* Improve codes

* Some code improvements

* The real indexer name changed to xxx.v1

Co-authored-by: zeripath <art27@cantab.net>
2020-08-30 19:08:01 +03:00
Lauris BH
1e785f357f Add detected file language to code search (#10256)
Move langauge detection to separate module to be more reusable

Add option to disable vendored file exclusion from file search

Allways show all language stats for search
2020-02-20 16:53:55 -03:00
Lunny Xiao
955915a31f Issue search support elasticsearch (#9428)
* Issue search support elasticsearch

* Fix lint

* Add indexer name on app.ini

* add a warnning on SearchIssuesByKeyword

* improve code
2020-02-13 14:06:17 +08:00
Lunny Xiao
ad8328393f Refactor code indexer (#9313)
* Refactor code indexer

* fix test

* fix test

* refactor code indexer

* fix import

* improve code

* fix typo

* fix test and make code clean

* fix lint
2019-12-23 20:31:16 +08:00
zeripath
de7ae6bad5 Restore Graceful Restarting & Socket Activation (#7274)
* Prevent deadlock in indexer initialisation during graceful restart

* Move from gracehttp to our own service to add graceful ssh

* Add timeout for start of indexers and make hammer time configurable

* Fix issue with re-initialization in indexer during tests

* move the code to detect use of closed to graceful

* Handle logs gracefully - add a pid suffix just before restart

* Move to using a cond and a holder for indexers

* use time.Since

* Add some comments and attribution

* update modules.txt

* Use zero to disable timeout

* Move RestartProcess to its own file

* Add cleanup routine
2019-10-15 14:39:51 +01:00
guillep2k
1a69511b30 Restrict repository indexing by glob match (#7767)
* Restrict repository indexing by file extension

* Use REPO_EXTENSIONS_LIST_INCLUDE instead of REPO_EXTENSIONS_LIST_EXCLUDE and have a more flexible extension pattern

* Corrected to pass lint gosimple

* Add wildcard support to REPO_INDEXER_EXTENSIONS

* This reverts commit 72a650c8e42f4abf59d5df7cd5dc27b451494cc6.

* Add wildcard support to REPO_INDEXER_EXTENSIONS (no make vendor)

* Simplify isIndexable() for better clarity

* Add gobwas/glob to vendors

* manually set appengine new release

* Implement better REPO_INDEXER_INCLUDE and REPO_INDEXER_EXCLUDE

* Add unit and integration tests

* Update app.ini.sample and reword config-cheat-sheet

* Add doc page and correct app.ini.sample

* Some polish on the doc

* Simplify code as suggested by @lafriks
2019-09-11 20:26:28 +03:00
Lunny Xiao
6f601b1a06 Issue indexer queue redis support (#6218)
* add redis queue

* finished indexer redis queue

* add redis vendor

* fix vet

* Update docs/content/doc/advanced/config-cheat-sheet.en-us.md

Co-Authored-By: lunny <xiaolunwen@gmail.com>

* switch to go mod

* Update required changes for new logging func signatures
2019-04-08 12:05:15 +03:00
Lunny Xiao
c59ca5c20e Add more tests and docs for issue indexer, add db indexer type for searching from database (#6144)
* add more tests and docs for issue indexer, add db indexer type for searching from database

* fix typo

* fix typo

* fix lint

* improve docs
2019-02-21 13:01:28 +08:00
Lunny Xiao
00227e19d1 Refactor issue indexer (#5363) 2019-02-19 09:39:39 -05:00