author Sacha Chua <sacha@sachachua.com> 2024-12-10 13:15:03 -0500
committer Sacha Chua <sacha@sachachua.com> 2024-12-10 13:15:03 -0500
commit 558d28b033396d4384e8cf36a36ae8f3d0c0f9b1
tree fcb58c935546b02f3940868f9a09b13508951c25 /2024/talks/p-search.md
parent 3dc72eb6ca41ddf41d00669e168648ef05aab535
add notes to p-search
Diffstat (limited to '2024/talks/p-search.md')
-rw-r--r-- 2024/talks/p-search.md 176
1 files changed, 176 insertions, 0 deletions
diff --git a/2024/talks/p-search.md b/2024/talks/p-search.md
index 01e8aed7..59d16379 100644
--- a/2024/talks/p-search.md
+++ b/2024/talks/p-search.md
@@ -53,6 +53,182 @@ tools.
Code: <https://github.com/zkry/p-search>
+# Discussion
+
+## Questions and answers
+
+- Q: Do you think a reduced version of this functionality could be
+ integrated into isearch?  Right now you can turn on various flags
+ when using isearch with M-s \<key\>, like M-s SPC to match spaces
+ literally.  Is it possible to add a flag to \"search the buffer
+ semantically\"? (Ditto with M-x occur, which is more similar to your
+ buffer-oriented results interface)
+ - A: It\'s essentially a framework, so you would create a generator;
+ but it does not exist yet.
+- Q: Any idea how this would work with personal information like
+ Zettelkastens? 
+ - A: Usable as is, because all the files are in a directory, so
+ you only have to set the files to search in. You can then add
+ information to ignore some files (like daily notes).
+ Documentation is coming.
+- Q: How well does the search work for synonyms, especially if you use
+ different languages?
+ - A: There is an entire field of search to translate the word that
+ is inputted to normalize it (like plural -\> singular
+ transformation). Currently p-search does not address this. 
+ - A: for different languages it gets complicated (vector search
+ possible, but might be too slow in Elisp).
+- Q: When searching by author I know authors may setup a new machine
+ and not put the exact same information. Is this doing anything to
+ combine those into one author?
+ - A: Currently using the git command. So if you know the emails
+ the author has used, you can add different priors.
+- Q: A cool, more powerful grep is \"Rak\", which may have some good
+ ideas for increasing the value of searches, for example using Raku
+ code while searching. Rak is written in Raku. Have you seen it? 
+ - [https://github.com/lizmat/App-Rak](https://github.com/lizmat/App-Rak){rel="noreferrer noopener"}
+ - [https://www.youtube.com/watch?v=YkjGNV4dVio&t=167s&pp=ygURYXBwIHJhayByYWt1IGdyZXA%3D](https://www.youtube.com/watch?v=YkjGNV4dVio&t=167s&pp=ygURYXBwIHJhayByYWt1IGdyZXA%3D){rel="noreferrer noopener"} 
+ - A: I have to look into that. Tree-sitter AST would also be cool
+ to include to have a better search.
+- Q: Have you thought about integrating results from using cosine
+ similarity with a deep-learning based vector embedding?  This will
+ let us search for \"fruit\" and get back results that have \"apple\"
+ or \"grapes\" in them \-- that kind of thing.  It will probably also
+ handle the case of terms that could be abbreviated/formatted
+ differently like in your initial example.
+ - A: Goes back to semantic search. Probably can be implemented,
+ but also probably too slow. And it is hard to get the embeddings
+ and the system running on the machine.
+- Q:  I missed the start of the talk, so apologies if this has been
+ covered - is it possible to save/bookmark searches or search
+ templates so they can be used again and again?
+ - A: Exactly.  I just recently added bookmarking capabilities, so
+ we can bookmark and rerun our searches from where we left off. 
+ I tried to create a one-to-one mapping from the search object to
+ the search object - there is a command to do this- to get a data
+ representation of the search, to get a custom plist and resume
+ the search where we left off, which can be used to create
+ a command to trigger a prior search.
+- Q: You mentioned candidate generators. Could you explain
+ what the score is assigned to? Is it to a line or to whatever the
+ candidate generator produces? How does it work with rg in your demo?
+
+   FOLLOW-UP: How does the git scoring thingy hook into this?\
+
+- - A: Candidate generator produces documents. Documents have
+ properties (like an id and a path). From that you get
+ subproperties like the content of the document. Each candidate
+ generator knows how to search in the files (emails, buffers,
+ files, urls, \...). There is only the notion of score +
+ document.
+ - Then another method is used to extract the lines that match in
+ the document (to show precisely the lines that match).
+
+- Q: Hearing about this makes me think about how nice the emergent
+ workflow with denote is, using easy filtering with orderless. It is
+ really easy to search for file tags, titles etc. and do things with
+ them. Did this or something like this help or influence the design of
+ p-search?
+ - A: You can search for whatever you want. No hardcoding is
+ possible for anything (file, directories, tags, titles\...).
+
+- Q: \[comments from IRC\] \<NullNix\> git covers the \"multiple
+ names\" thing itself: see .mailmap  10:51:19 
+ - \<NullNix\> this is a git feature, p-search shouldn\'t need to
+ implement it  10:51:34 
+ - \<NullNix\> To me this seems to have similarities to notmuch \--
+ honestly I want notmuch with the p-search UI :) (of course,
+ notmuch uses a xapian index, because repeatedly grepping all
+ traffic on huge mailing lists would be insane.)  10:55:30 
+ - \<NullNix\> (notmuch also has bookmark-like things as a core
+ feature, but no real weighting like p-search does.)  10:56:07 
+ - A: I have not used notmuch, but many extensions are
+ possible. mu4e is using a full index for the search. This
+ could be adapted here too, with the SQL database as source. 
+
+- Q: You can search a buffer using ripgrep by feeding it in as stdin
+ to the ripgrep process, can\'t you?
+ - A: Yes you can. But the aim is to search many different things
+ in elisp. So there is a mechanism in p-search anyway to be able
+ to represent anything including buffers. This is working pretty
+ well.
+
+- Q:  Thanks for making this lovely thing, I\'m looking forward to
+ trying it out.  Seems modular and well thought out. Questions about
+ integration and about the interface
+ - A: project.el is used to search only in the local files of the
+ project (as done by default)
+
+- Q: how happy are you with the interface?
+ - A: p-search goes over all the files trying to find the
+ best match. Many features can be added, e.g., to improve debuggability
+ (is this highly ranked due to a bug? due to a high weight? many
+ matching documents?)
+ - A: hopefully will be on ELPA at some point with proper
+ documentation.
+
+- Q: Remembering searches is not available everywhere (rg.el? but AI
+ package like gptel already have it). Also useful for using the
+ document in the future.
+ - A: Retrieval-augmented generation: p-search could be used for
+ the search, combining it with an AI to fine-tune the search with
+ a Q-A workflow. Although currently no API.  
+ - (gptel author here: I\'m looking forward to seeing if I can use
+ gptel with p-search)
+ - A: as the results are surprisingly good, why is that not used
+ anywhere else? But there is a lot of setup to get it right. You
+ need something like emacs with many configuration options (transient
+ is helping to do that) without scaring the users. 
+ - Everyone uses emacs differently, so unclear how people will
+ really use it. (PlasmaStrike) For example consult-omni
+ (elfeed-tube, \...) searching multiple webpages at the same
+ time, with orderless. However, no webpage offers this option.
+ Somehow those tools stay in emacs only. (Corwin Brust) This is
+ the strength of emacs: people invest a lot of time to improve
+ their workflow from tomorrow. \[see xkcd on emacs learning curve
+ vs nano vs vim\]
+ - [https://github.com/armindarvish/consult-omni](https://github.com/armindarvish/consult-omni){rel="noreferrer noopener"}
+ - [https://github.com/karthink/elfeed-tube](https://github.com/karthink/elfeed-tube){rel="noreferrer noopener"}
+ - [https://www.reddit.com/r/ProgrammerHumor/comments/9d6f19/text_editor_learning_curves_fixed/](https://www.reddit.com/r/ProgrammerHumor/comments/9d6f19/text_editor_learning_curves_fixed/){rel="noreferrer noopener"}
+ - A: emacs is not the most beginner friendly, but the solution
+ space is very large
+ - (Corwin Brust) Emacs supports all approaches and is extensible.
+ (PlasmaStrike) YouTube is much larger, but somehow does not have
+ this nice sane interface.
+
+- Q: Do you think Emacs being kinda slow will get in the way of
+ being able to run a lot of scoring algorithms?
+ - A: The code currently is dumb in a lot of places (like going over
+ all files to calculate a score), but that is not that slow
+ surprisingly. Elisp enumerating all files and multiplying
+ numbers in the emacs repo isn\'t really slow. But if you have to
+ search in files, this will be slow without relying on ripgrep or
+ a faster tool. Take for example the search in info files / elisp
+ info files, the search in elisp is almost instant. For
+ human-size documents, probably fast enough \-- and if not, there
+ is room for optimizations. For company-size documents (like
+ repos), could be too slow.
+
+- Q: When do you have to make something more complicated to scale
+ better?
+ - A: I do not know yet really. I try to automate tasks as much as
+ possible, like in the emacs configuration meme \"not doing work
+ I have to do the configuration\". Usually I do not add web-based
+ things into emacs.
+
+## Notes
+
+- I like the dedicated-buffer interface (I\'m assuming using
+ magit-section and transient).
+- \<meain\> Very interesting ideas. I was very happy when I was able
+ to do simple
+ filters with orderless, but this is great \[11:46\]
+- \<NullNix\> I dunno about you, but I want to start using p-search
+ yesterday.
+ (possibly integrating lsp-based tokens
+ somehow\...) \[11:44\]
+- \<codeasone\> Awesome job Ryota, thank you for sharing! 
+
[[!inline pages="internal(2024/info/p-search-after)" raw="yes"]]