path: root/2024/captions/emacsconf-2024-p-search--psearch-a-local-search-engine-in-emacs--zac-romero--answers.vtt
WEBVTT

00:00:00.000 --> 00:00:03.559
...starting the recording here in the chat, and I see some

00:00:03.560 --> 00:00:06.039
questions already coming in. So thank you so much for your

00:00:06.040 --> 00:00:09.359
talk, Zac, and I'll step out of your way and let you field

00:00:09.360 --> 00:00:10.279
some of these questions.

00:00:10.280 --> 00:00:21.999
Sounds good. All right, so let's see. I'm going off of the

00:00:22.000 --> 00:00:22.969
question list.

NOTE Q: Do you think a reduced version of this functionality could be integrated into isearch?

00:00:22.970 --> 00:00:25.839
So the first one is about having a reduced

00:00:25.840 --> 00:00:31.999
version of the functionality integrated into isearch. So

00:00:32.000 --> 00:00:37.919
yeah, with the way things are set up, it is essentially a

00:00:37.920 --> 00:00:42.679
framework. So

00:00:42.680 --> 00:00:46.279
you can create a candidate. So just a review from the talk. So

00:00:46.280 --> 00:00:49.919
you have these candidate generators which generate search

00:00:49.920 --> 00:00:54.559
candidates. So you can have a file system candidate which

00:00:54.560 --> 00:00:58.519
generates these file documents, which have text content in

00:00:58.520 --> 00:01:01.799
them. In theory, you could have like a website candidate

00:01:01.800 --> 00:01:06.399
generator, and it could be like a web crawler. I mean, so

00:01:06.400 --> 00:01:10.519
there's a lot of different options. So one option, it's on my

00:01:10.520 --> 00:01:15.039
mind, and I hope to get to this soon, is create a defun, like a

00:01:15.040 --> 00:01:18.599
defun candidate generator. So basically it takes a file,

00:01:18.600 --> 00:01:22.279
splits it up into like defuns, kind of like just like what

00:01:22.280 --> 00:01:26.279
isearch would do, and then use each of those, the body of

00:01:26.280 --> 00:01:30.959
those, as a content for the search session. So, I mean,

00:01:30.960 --> 00:01:35.359
essentially you could just, you could start up a session,

00:01:35.360 --> 00:01:39.479
and there's like programmatic ways to start these up too. So

00:01:39.480 --> 00:01:42.599
you could, if such a candidate generator was created, you

00:01:42.600 --> 00:01:49.559
could easily, and just like, you know, one command. Get the

00:01:49.560 --> 00:01:54.599
defuns, create a search session with it, and then just go

00:01:54.600 --> 00:02:01.439
straight to your query. So, definitely, something

00:02:01.440 --> 00:02:06.919
just like this is in the works. And I guess another thing is

00:02:06.920 --> 00:02:08.239
interface.

00:02:08.240 --> 00:02:17.079
The whole dedicated buffer is helpful for searching, but

00:02:17.080 --> 00:02:21.919
with this isearch case, there's currently not a way to have a

00:02:21.920 --> 00:02:27.839
reduced UI, where it's just like, OK, I have these function

00:02:27.840 --> 00:02:32.239
defuns for the current file. I just want them to pop up at the

00:02:32.240 --> 00:02:35.799
bottom so I can quickly go through it. So currently, I don't

00:02:35.800 --> 00:02:41.199
have that. But such a UI is definitely, yeah, thinking about

00:02:41.200 --> 00:02:45.359
how that could be done.

NOTE Q: Any idea how this would work with personal information like Zettelkastens?

00:02:45.360 --> 00:02:50.359
Alright, so yeah. So next question. Any idea how this

00:02:50.360 --> 00:02:52.599
will work with personal information like Zettelkasten?

00:02:52.600 --> 00:02:58.319
So this is, this is like, I mean, it's essentially usable as

00:02:58.320 --> 00:03:04.559
is with Zettelkasten method. So, I mean, that I mean

00:03:04.560 --> 00:03:08.279
basically what like for example org-roam, and I think other

00:03:08.280 --> 00:03:12.159
ones like Denote, they put all these files in the

00:03:12.160 --> 00:03:15.919
directory, and so with the already existing file system

00:03:15.920 --> 00:03:19.679
candidate generator all you'd have to do is set that to be the

00:03:19.680 --> 00:03:23.199
directory of your Zettelkasten system and then it would

00:03:23.200 --> 00:03:26.799
just pick up all the files in there and

00:03:26.800 --> 00:03:28.799
then add those as search candidates. So you could easily

00:03:28.800 --> 00:03:33.279
just search whatever system you have.

00:03:33.280 --> 00:03:36.039
Based off of the ways it's set up, if you had maybe your

00:03:36.040 --> 00:03:40.999
dailies you didn't want to search, it's just as easy to add a

00:03:41.000 --> 00:03:44.519
criteria saying, I don't want dailies to be searched. Like

00:03:44.520 --> 00:03:47.599
give, like just eliminate the date, like the things from the

00:03:47.600 --> 00:03:51.679
daily from the subdirectory. And then there you go. You have

00:03:51.680 --> 00:03:57.799
your Zettelkasten search engine, and you could just copy

00:03:57.800 --> 00:03:59.999
the, you know, there's, I mean, I need, I'm working on

00:04:00.000 --> 00:04:03.519
documentation for this to kind of set this up easily, but,

00:04:03.520 --> 00:04:06.679
you know, you could just create your simple command, just

00:04:06.680 --> 00:04:10.679
like, your simple command, just like, just take in a text

00:04:10.680 --> 00:04:14.359
query, run it through the system, and then just get your

00:04:14.360 --> 00:04:19.599
search results right there. So yeah, definitely that is a

00:04:19.600 --> 00:04:22.040
use case that's on top of my mind.

NOTE Q: How good does the search work for synonyms especially if you use different languages?

00:04:22.041 --> 00:04:23.239
So next one, how good does a

00:04:23.240 --> 00:04:26.439
search work for synonyms, especially if you use different

00:04:26.440 --> 00:04:30.719
languages? Okay, this is a good question because with the

00:04:30.720 --> 00:04:34.719
way that BM25 works, it's essentially just like trying to

00:04:34.720 --> 00:04:41.119
find where terms occur and just counts them up.

00:04:41.120 --> 00:04:43.999
I mean, this is something I couldn't get into. There's just

00:04:44.000 --> 00:04:46.919
too much on the topic of information retrieval to kind of go

00:04:46.920 --> 00:04:52.879
into this, but there is a whole kind of field of just like, how

00:04:52.880 --> 00:04:58.279
do you, given a search term, how do you know what you should

00:04:58.280 --> 00:05:02.519
search for? So like popular kind of industrial search

00:05:02.520 --> 00:05:07.519
engines, like they have kind of this feature where you can

00:05:07.520 --> 00:05:11.039
like define synonyms, define term replacement. So

00:05:11.040 --> 00:05:14.079
whenever you see this term, it should be this. And it even

00:05:14.080 --> 00:05:15.091
gets even further.

NOTE Plurals

00:05:15.092 --> 00:05:19.439
If someone searches for a plural string,

00:05:19.440 --> 00:05:22.279
how do you get the singular from that and search for that? So

00:05:22.280 --> 00:05:27.559
this is a huge topic that currently p-search doesn't

00:05:27.560 --> 00:05:33.519
address, but it's on the top of my mind as to how. So that's one

00:05:33.520 --> 00:05:33.882
part.
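
NOTE Editorial sketch, not from the talk. Assuming a user-supplied synonym table
and a deliberately naive plural rule, the query expansion described above could
look roughly like this in Python. The names singularize and expand_query are
hypothetical and are not part of p-search, which currently has no such feature.
```python
def singularize(term):
    """Very naive English plural stripping; real stemmers handle far more cases."""
    if term.endswith("ies") and len(term) > 4:
        return term[:-3] + "y"
    if term.endswith("s") and not term.endswith("ss") and len(term) > 3:
        return term[:-1]
    return term
def expand_query(terms, synonyms):
    """Return the lowercased terms plus singular forms and user-defined synonyms."""
    expanded = []
    for term in terms:
        lowered = term.lower()
        base = singularize(lowered)
        # Keep the order stable and skip duplicates.
        for candidate in [lowered, base, *synonyms.get(base, [])]:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded
```
Each expanded term could then be counted separately, so a search for a plural
would also pick up documents that only contain the singular or a listed synonym.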

NOTE Different languages

00:05:33.883 --> 00:05:38.999
The next part is for different languages, one thing

00:05:39.000 --> 00:05:42.839
that kind of seems like it's promising is vector search,

00:05:42.840 --> 00:05:47.399
which, I mean, with the way p-search is set up, you could

00:05:47.400 --> 00:05:51.159
easily just create a vector search prior, plug it into the

00:05:51.160 --> 00:05:54.599
system, and start using it. The only problem is that kind of

00:05:54.600 --> 00:05:58.879
the vector search functions, like you have to do like cosine

00:05:58.880 --> 00:06:03.639
similarity, like if you have like 10,000 documents, if

00:06:03.640 --> 00:06:06.679
you're writing Elisp to calculate the cosine similarity

00:06:06.680 --> 00:06:09.879
between the vectors, that's going to be very slow. And so now

00:06:09.880 --> 00:06:14.159
the whole can of worms of indexing comes up. And how do you do

00:06:14.160 --> 00:06:17.479
that? And is that going to be native Elisp? And so that's a

00:06:17.480 --> 00:06:21.839
whole other can of worms. So yeah, vector search seems

00:06:21.840 --> 00:06:25.959
promising. And then hopefully maybe other traditional

00:06:25.960 --> 00:06:33.439
synonyms, stemming, that kind of stuff for alternate

00:06:33.440 --> 00:06:40.199
terms, that could also be incorporated.
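
NOTE Editorial sketch, not from the talk. To make the cost concern concrete, this
is a brute-force cosine-similarity ranking in Python, assuming every document
already has an embedding vector (how those embeddings are produced is left open).
The function names are hypothetical. Each query touches every dimension of every
document vector, which is why an index or native code becomes attractive.
```python
import math
def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # Treat a zero vector as similar to nothing.
    return dot / (norm_u * norm_v)
def rank_by_similarity(query_vec, doc_vecs):
    """doc_vecs maps document id to vector; returns ids, most similar first."""
    return sorted(
        doc_vecs,
        key=lambda doc_id: cosine_similarity(query_vec, doc_vecs[doc_id]),
        reverse=True,
    )
```
With 10,000 documents and vectors of a few hundred dimensions, one query already
means millions of multiplications, all done in the interpreter.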

NOTE Q: When searching by author I know authors may setup a new machine and not put the exact same information. Is this doing anything to combine those into one author?

00:06:40.200 --> 00:06:43.719
Okay, next one. When searching by author, I know authors may

00:06:43.720 --> 00:06:47.119
set up a new machine and not put the exact same information.

00:06:47.120 --> 00:06:49.519
Is this doing anything to combine those into one author?

00:06:49.520 --> 00:06:54.399
Okay, so for this one, it's not. So it's like the way the git

00:06:54.400 --> 00:06:58.119
prior is currently set up is that it just does like a git

00:06:58.120 --> 00:07:01.999
command to get all the git authors. You select one and then it

00:07:02.000 --> 00:07:07.959
just uses that. But the thing is, is if you knew the two emails

00:07:07.960 --> 00:07:12.519
that user might have used, the two usernames, you could just

00:07:12.520 --> 00:07:14.279
set up the

00:07:14.280 --> 00:07:19.799
two priors. One for the old user's email, and then just add

00:07:19.800 --> 00:07:24.079
another prior for the new user's email. And then that would

00:07:24.080 --> 00:07:29.279
be a way to just get both of those set up. So that's kind of a

00:07:29.280 --> 00:07:32.959
running theme throughout p-search is that it's made to be

00:07:32.960 --> 00:07:36.239
very flexible and very kind of like Lego-block-ish, kind of

00:07:36.240 --> 00:07:39.959
like you can just, you know, if you need, you know, if

00:07:39.960 --> 00:07:41.919
something doesn't meet your needs, you know, it's easy to

00:07:41.920 --> 00:07:45.959
put pieces in, create new components of the search

00:07:45.960 --> 00:07:51.799
engine. Let's see, a cool powerful grep "Rak" to maybe have

00:07:51.800 --> 00:07:58.839
some good ideas. I have searches record code while

00:07:58.840 --> 00:08:04.039
searching. Okay. So. Okay, that's interesting. I'll have

00:08:04.040 --> 00:08:05.239
to look into this

00:08:05.240 --> 00:08:15.279
tool. I haven't seen that. I do kind of keep my eyes out for

00:08:15.280 --> 00:08:18.199
these kind of things. One thing I have seen that was kind of

00:08:18.200 --> 00:08:24.439
that, I mean, looked interesting was kind of like AST, like

00:08:24.440 --> 00:08:29.519
the tree-sitter, the tree-sitter grep tools. But like, you

00:08:29.520 --> 00:08:35.359
can grep for a string in the language itself. So that's

00:08:35.360 --> 00:08:37.959
something I think would be cool to implement either,

00:08:37.960 --> 00:08:41.359
because I mean, there's tree-sitter in Emacs, so it's

00:08:41.360 --> 00:08:44.519
possible to do in Elisp. If not, there are those kind of like

00:08:44.520 --> 00:08:47.719
tree-sitter tools. So that's, that's something that I think would

00:08:47.720 --> 00:08:50.719
be cool to incorporate.

NOTE Q: Have you thought about integrating results from using cosine similarity with a deep-learning based vector embedding?

00:08:50.720 --> 00:08:58.279
Let's see. Have you thought about integrating results from

00:08:58.280 --> 00:09:00.999
using cosine similarity with a deep learning based vector

00:09:01.000 --> 00:09:06.679
embedding? Yeah, exactly. So yeah, this kind of goes back to

00:09:06.680 --> 00:09:09.759
the topic before it. Definitely the whole semantic search

00:09:09.760 --> 00:09:12.679
with vector embeddings, that's something that, I mean, it

00:09:12.680 --> 00:09:15.479
would be actually kind of trivial to implement that in

00:09:15.480 --> 00:09:20.239
p-search. But like I said, computing the cosine similarity

00:09:20.240 --> 00:09:25.959
in Elisp, it's probably too slow.

00:09:25.960 --> 00:09:34.879
And then also there's a whole question of how do you get the embeddings?

00:09:34.880 --> 00:09:36.919
Like, how do you get the system running locally on your

00:09:36.920 --> 00:09:41.239
machine if you want to run it that or, I mean, so that's

00:09:41.240 --> 00:09:48.879
actually another kind of aspect that I need to look into.

00:09:48.880 --> 00:10:01.939
Okay, so let's see.

NOTE Q: Is it possible to save/bookmark searches or search templates so they can be used again and again?

00:10:01.940 --> 00:10:06.319
Okay, next question. Let's see. I'm sorry if this has been

00:10:06.320 --> 00:10:09.079
covered. Is it possible to save/bookmark searches or search

00:10:09.080 --> 00:10:14.559
templates so they can be used again and again? Exactly. So

00:10:14.560 --> 00:10:18.199
just recently I added bookmarking capabilities. So

00:10:18.200 --> 00:10:21.119
you can essentially just bookmark whatever search session you

00:10:21.120 --> 00:10:26.359
have. And yeah, and it's just, it was just a bookmark. You can

00:10:26.360 --> 00:10:29.839
just open and just like reopen that, rerun that search from

00:10:29.840 --> 00:10:36.119
where you left off. So there's that. And then also, I tried to

00:10:36.120 --> 00:10:40.559
set this up so that there is a one-to-one mapping of a Lisp

00:10:40.560 --> 00:10:44.759
object to the search session. So from every search session

00:10:44.760 --> 00:10:49.519
you make, you should be able to get a, there's a command to do

00:10:49.520 --> 00:10:55.199
this, to get a data representation of the search. So it would

00:10:55.200 --> 00:11:00.079
just be like some plist. All you have to do is just take that

00:11:00.080 --> 00:11:04.479
plist, call this function p-search-setup-buffer with that

00:11:04.480 --> 00:11:09.119
data. And then that function should set up the session as you

00:11:09.120 --> 00:11:12.599
left off. So then like, you know, you could make your

00:11:12.600 --> 00:11:15.359
commands easy. You can make custom search commands super

00:11:15.360 --> 00:11:18.919
easy. You just get the data representation of that search,

00:11:18.920 --> 00:11:22.519
find what pieces you want the user to be able to, you know, the

00:11:22.520 --> 00:11:26.333
search term, make that a parameter in the

00:11:26.334 --> 00:11:29.079
command, in the interactive code. So you'd have like

00:11:29.080 --> 00:11:31.906
a prompt on top and then there you go. You have,

00:11:31.907 --> 00:11:34.327
you have a command to do the search

00:11:34.328 --> 00:11:35.759
just like just right there. So, so

00:11:35.760 --> 00:11:38.519
there's a lot of those things and there's a lot more that

00:11:38.520 --> 00:11:40.999
could be done. Like maybe having, you know, there's kind of

00:11:41.000 --> 00:11:45.479
in the works and like thinking about having groups of groups

00:11:45.480 --> 00:11:48.959
of these things, like maybe you can set up like, Oh, I always

00:11:48.960 --> 00:11:51.919
add these three criteria together. So I, you know, maybe I

00:11:51.920 --> 00:11:54.559
can make a preset out of these and make them easy, easily

00:11:54.560 --> 00:11:58.079
addable. So yeah. A lot of things like that are, you know, I'm

00:11:58.080 --> 00:12:02.799
thinking about a lot of things about that, so.

NOTE Q: You mentioned candidate generators. Could you explain what the score is assigned to?

00:12:02.800 --> 00:12:06.079
Okay, so next question. You mentioned about candidate

00:12:06.080 --> 00:12:08.479
generators. Could you explain about what the score is

00:12:08.480 --> 00:12:12.199
assigned to? Is this to a line or whatever the candidate

00:12:12.200 --> 00:12:17.079
generates? How does it work with our junior demo? Okay,

00:12:17.080 --> 00:12:21.799
yeah, so this is a, this is, so actually I had to implement, I

00:12:21.800 --> 00:12:26.719
had to rewrite p-search just to get this part right. So the

00:12:26.720 --> 00:12:31.159
candidate generator generates documents. Documents have

00:12:31.160 --> 00:12:36.919
properties. So the most notable property is the content

00:12:36.920 --> 00:12:40.599
property. So essentially what happens is that when you

00:12:40.600 --> 00:12:42.879
create a file system candidate generator and give it a

00:12:42.880 --> 00:12:45.919
directory, the code goes into the directory, kind of

00:12:45.920 --> 00:12:49.079
recursively goes through all the directories, and

00:12:49.080 --> 00:12:51.559
generates a candidate, which is just like a simple list

00:12:51.560 --> 00:12:55.679
form. It's saying, this is a file, the file path is this. So

00:12:55.680 --> 00:13:00.799
that's the document ID. So this is saying, this is a file,

00:13:00.800 --> 00:13:05.559
it's a file, and its file path is this. And so from that, you

00:13:05.560 --> 00:13:09.279
get all of the different properties, the sub properties. If

00:13:09.280 --> 00:13:11.719
you're given that, you know how to get the content. If you're

00:13:11.720 --> 00:13:15.439
given that, you know how to... So all these properties come

00:13:15.440 --> 00:13:18.839
out. And then also the candidate generator is the thing that

00:13:18.840 --> 00:13:25.439
knows how best to search for the terms. So for example, there

00:13:25.440 --> 00:13:29.159
is a buffer candidate generator. What that does is it just

00:13:29.160 --> 00:13:34.759
puts all your buffers as search candidates. So obviously

00:13:34.760 --> 00:13:37.879
you can't, you can't run ripgrep on buffers like you can't you

00:13:37.880 --> 00:13:41.759
can't do that, you can't run ripgrep on just like yeah just

00:13:41.760 --> 00:13:44.319
just like buffers that don't have files attached or, for

00:13:44.320 --> 00:13:47.559
example, maybe there's like an internet search candidate

00:13:47.560 --> 00:13:51.279
generator, like a web crawler thing. You just imagine it

00:13:51.280 --> 00:13:55.759
goes to a website, kind of crawls all the links and all that,

00:13:55.760 --> 00:13:58.119
and then just gets your web pages for the candidates.

00:13:58.120 --> 00:14:01.159
Obviously, you can't use ripgrep for that either. So, every

00:14:01.160 --> 00:14:04.679
candidate generator knows how best to search for the terms

00:14:04.680 --> 00:14:08.919
of what candidate it's generating. So, the file system

00:14:08.920 --> 00:14:12.359
candidate generator will say, okay, I have a base

00:14:12.360 --> 00:14:17.239
directory. So, if you ask me, the file system candidate

00:14:17.240 --> 00:14:21.239
generator, how to get the terms, it knows it's set up to use

00:14:21.240 --> 00:14:25.199
ripgrep. And so, it runs ripgrep, and so then it goes

00:14:25.200 --> 00:14:29.439
through, it runs the command, gets the counts, and then

00:14:29.440 --> 00:14:32.359
stores those counts. So, the lines have nothing. At this

00:14:32.360 --> 00:14:35.999
point, the lines have nothing. There's no notion of lines at

00:14:36.000 --> 00:14:40.559
all. It's just document, document ID with the amount of

00:14:40.560 --> 00:14:43.839
times it matched. And that's all you need to run this BM25

00:14:43.840 --> 00:14:47.519
algorithm. But then when you get the top results, you

00:14:47.520 --> 00:14:51.359
obviously want to see the lines that matched. And so there's

00:14:51.360 --> 00:14:56.399
another thing, another method to kind of get the exact

00:14:56.400 --> 00:15:00.559
thing, to kind of match out the particular lines. And so

00:15:00.560 --> 00:15:03.159
that's a separate mechanism. And that can be done in Elisp,

00:15:03.160 --> 00:15:05.719
because if you're not displaying, that's kind of a design

00:15:05.720 --> 00:15:09.319
decision of p-search, is that it only displays like maybe 10

00:15:09.320 --> 00:15:12.519
or 20. It doesn't display all the results. So you can have

00:15:12.520 --> 00:15:16.679
Elisp just go crazy with just like highlighting things,

00:15:16.680 --> 00:15:22.719
picking the best kind of pieces to show. So yeah, that's how

00:15:22.720 --> 00:15:27.359
that's set up.
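
NOTE Editorial sketch, not from the talk. This shows how per-document match
counts alone can feed a BM25 score, written in Python rather than Elisp. The
function name, the k1 and b defaults, and the doc_lengths input are assumptions;
the talk does not say how p-search obtains document lengths or which variant of
the formula it uses. Lines play no role here, matching the description above.
```python
import math
def bm25_scores(counts, doc_lengths, k1=1.2, b=0.75):
    """counts: {term: {doc_id: occurrences}}; doc_lengths: {doc_id: size}."""
    n_docs = len(doc_lengths)
    avg_len = sum(doc_lengths.values()) / n_docs
    scores = {doc_id: 0.0 for doc_id in doc_lengths}
    for term, per_doc in counts.items():
        # Number of documents that contain the term at least once.
        df = sum(1 for c in per_doc.values() if c > 0)
        idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
        for doc_id, tf in per_doc.items():
            # Longer-than-average documents are penalized via b.
            norm = tf + k1 * (1 - b + b * doc_lengths[doc_id] / avg_len)
            scores[doc_id] += idf * tf * (k1 + 1) / norm
    return scores
```
A candidate generator would only need to supply the counts mapping, for
instance from a per-file count of matches, and the top-scoring documents could
then be handed to a separate step that finds the lines worth displaying.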

00:15:27.360 --> 00:15:38.279
So, here's perhaps a good moment for me to just jump in and

00:15:38.280 --> 00:15:42.079
comment that in a minute or so we will break away with the live

00:15:42.080 --> 00:15:47.439
stream to give people an hour of less content to make sure

00:15:47.440 --> 00:15:50.639
everybody goes and takes their lunch and break a little bit.

00:15:50.640 --> 00:15:55.039
But if you would like to keep going in here, love to, love to

00:15:55.040 --> 00:15:59.839
take as many questions. And, of course, we will include

00:15:59.840 --> 00:16:06.159
that all when we publish the Q and A. Sounds good. Yeah, I'll go

00:16:06.160 --> 00:16:12.199
and stick around on the stream as we cut away, as we've got a

00:16:12.200 --> 00:16:15.999
little video surprise we've all prepared to play, just some

00:16:16.000 --> 00:16:19.359
comments from an Emacs user dated in 2020 or something like

00:16:19.360 --> 00:16:29.679
this. I forget the detail. Thank you again so much, Zac, for

00:16:29.680 --> 00:16:30.959
your fascinating talk.

00:16:30.960 --> 00:16:32.301
Yeah, so, okay.

NOTE Q: easy filtering with orderless - did this or something like this help or influence the design of psearch?

00:16:32.302 --> 00:16:33.359
This makes me really think about the

00:16:33.360 --> 00:16:35.999
emergent workflows with Denote and easy filtering with

00:16:36.000 --> 00:16:36.639
orderless.

00:16:36.640 --> 00:16:42.039
Did this or something like this help influence the design of

00:16:42.040 --> 00:16:47.359
p-search? Yeah, exactly. So, I mean, yeah, I mean, there's

00:16:47.360 --> 00:16:49.919
just so many different searches. Like, it's just kind of

00:16:49.920 --> 00:16:52.519
mind-boggling. Like, you could search for whatever you want

00:16:52.520 --> 00:16:54.599
on your computer. Like, there's just so much, like, you

00:16:54.600 --> 00:17:01.199
can't, yeah, you can't just like, you can't just like hard

00:17:01.200 --> 00:17:04.159
code any of these things. It's all malleable. Like maybe

00:17:04.160 --> 00:17:09.279
somebody wants to search these directories. And so, yeah,

00:17:09.280 --> 00:17:10.639
like

00:17:10.640 --> 00:17:18.399
exactly like that use case of having a directory of files

00:17:18.400 --> 00:17:18.959
where

00:17:18.960 --> 00:17:25.919
they contain your personal knowledge management system.

00:17:25.920 --> 00:17:33.479
Yeah, that use case definitely was at the top of my mind.

00:17:33.480 --> 00:17:35.879
Let's see.

00:17:35.880 --> 00:17:56.959
Let's see, so Git covers the multiple names thing itself.

NOTE Q: Notmuch with the p-search UI

00:17:56.960 --> 00:18:00.359
Okay, yeah,

00:18:00.360 --> 00:18:09.599
so something about notmuch with p-search UI. Actually,

00:18:09.600 --> 00:18:16.399
interestingly, I think notmuch is, I haven't used it

00:18:16.400 --> 00:18:22.759
myself, but that's the email... something about... Yeah, so I mean,

00:18:22.760 --> 00:18:25.679
this is like, these things are just like, these kind of

00:18:25.680 --> 00:18:30.479
extensions could kind of go on forever. But one thing I

00:18:30.480 --> 00:18:33.369
thought about is, like, I use mu4e for email

00:18:33.370 --> 00:18:41.119
and that uses a full-fledged index. And so having

00:18:41.120 --> 00:18:44.879
some method to kind of reach into these different systems

00:18:44.880 --> 00:18:47.938
and kind of be kind of like a front end for this.

00:18:47.939 --> 00:18:52.000
Another thing is maybe SQL database.

00:18:52.001 --> 00:18:55.823
You can create a candidate generator from a SQLite query

00:18:55.824 --> 00:19:01.919
and then... yeah...

00:19:02.583 --> 00:19:05.519
I've had tons of ideas of different things you could

00:19:05.520 --> 00:19:09.559
incorporate into the system. Slowly,

00:19:09.560 --> 00:19:13.599
they're being implemented. Just recently, I implemented

NOTE Info

00:19:13.600 --> 00:19:17.039
an info file candidate generator. So it lists out all the

00:19:17.040 --> 00:19:21.559
info files, and then it creates a candidate for each of the

00:19:21.560 --> 00:19:26.759
info nodes. So it turns out, yeah, I mean, it works pretty, I

00:19:26.760 --> 00:19:32.559
mean, just as well as Google. So, from my own testing.

00:19:32.560 --> 00:19:39.999
Let's see, you can search a buffer using ripgrep, feeding it in

00:19:40.000 --> 00:19:44.759
as standard input to the ripgrep process, can't you? Yep, yeah,

00:19:44.760 --> 00:19:50.039
you can definitely search a buffer that way. So, yeah, I

00:19:50.040 --> 00:19:56.359
mean, based off of I mean, if this, yeah, so one thing that

00:19:56.360 --> 00:19:59.039
came up is that the system wants, I mean, I wanted the system

00:19:59.040 --> 00:20:03.559
to be able to search a lot of different things. And so it came

00:20:03.560 --> 00:20:05.999
up that I had, you know, implementing,

00:20:06.000 --> 00:20:10.159
doing these search things, having an Elisp

00:20:10.160 --> 00:20:13.079
implementation, despite it being slow, would be

00:20:13.080 --> 00:20:17.399
necessary. So like anything that isn't represented as a

00:20:17.400 --> 00:20:21.639
file, Elisp, there's a mechanism in p-search to search for

00:20:21.640 --> 00:20:23.319
it.

00:20:23.320 --> 00:20:29.719
So, yeah, so having that redundancy kind of lets you get into

00:20:29.720 --> 00:20:32.799
the, you know, using kind of ripgrep for the big scale

00:20:32.800 --> 00:20:37.759
things. But then when you get to the individual file, you

00:20:37.760 --> 00:20:40.999
know, just going back to Elisp to kind of get the finer

00:20:41.000 --> 00:20:47.199
details seems to, you know, seems to end up working pretty

00:20:47.200 --> 00:21:04.239
well.

00:21:04.240 --> 00:21:27.399
Thank you all for listening. Yeah, sounds like we're about

00:21:27.400 --> 00:21:31.279
out of questions. Hi, Zac. I have a question or still a

00:21:31.280 --> 00:21:34.119
question. I just want to thank everybody one more time for

00:21:34.120 --> 00:21:37.719
their participation, especially you for speaking, Zac. I

00:21:37.720 --> 00:21:41.239
look forward to playing with p-search myself. Thank you.

00:21:41.240 --> 00:21:44.039
Yeah, there might be one last question. Is there someone?

00:21:44.040 --> 00:21:48.519
Yes, there is. I don't know if you can understand me, but

00:21:48.520 --> 00:21:50.359
thank you for making this lovely thing.

00:21:50.360 --> 00:21:57.919
I feel inspired to try it out and I'm thinking about how to

00:21:57.920 --> 00:22:04.199
integrate it because it sounds modular and nicely thought

00:22:04.200 --> 00:22:09.799
out. One small question. Have you thought about project.el

00:22:09.800 --> 00:22:13.719
integration? And then I have a little bigger question about

00:22:13.720 --> 00:22:14.879
the interface.

NOTE project.el integration

00:22:14.880 --> 00:22:20.799
Yeah, project.el integration, it's used in a couple of ways.

00:22:20.800 --> 00:22:25.719
It's kind of used as, like, a default.

00:22:25.720 --> 00:22:31.279
This is the directory I want to search for the default

00:22:31.280 --> 00:22:33.639
p-search command. It does, yeah, it kind of goes off of

00:22:33.640 --> 00:22:37.119
project.el. If there is a project, it kind of says, okay, this,

00:22:37.120 --> 00:22:40.319
I want to search this project. And so it kind of, it uses that

00:22:40.320 --> 00:22:46.119
as a default. So there's that. Because I use the project-grep

00:22:46.120 --> 00:22:50.679
or git-grep search a lot and maybe this is a better solution to

00:22:50.680 --> 00:22:55.319
the search and the interface you have right now for the

00:22:55.320 --> 00:22:56.476
search results.

NOTE Q: How happy are you with the interface?

00:22:56.477 --> 00:22:58.719
How happy are you with it and have you

00:22:58.720 --> 00:23:02.599
thought about improving or have you ideas for

00:23:02.600 --> 00:23:06.639
improvements? Yeah, well actually what you see in the demo

00:23:06.640 --> 00:23:09.199
in the video isn't... There's actually, there is an

00:23:09.200 --> 00:23:13.959
improvement in the current code. Basically, what it

00:23:13.960 --> 00:23:17.239
does is... the current default is that it scans

00:23:17.240 --> 00:23:20.054
the entire file for all of the searches.

00:23:20.055 --> 00:23:25.959
It finds the window that has the highest score. So it kind

00:23:25.960 --> 00:23:29.599
of goes through the entire file and just says... And it kind of finds

00:23:29.600 --> 00:23:33.479
like the piece of the section of text that has the most

00:23:33.480 --> 00:23:37.919
matches with the terms that score the best. So it's, I mean,

00:23:37.920 --> 00:23:40.119
that section is pretty good. I mean, that, so yeah, that,

00:23:40.120 --> 00:23:44.519
that ends up working pretty well. So I mean, in terms of other

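NOTE Sketch: picking the preview window. A minimal Python sketch of the idea described in this answer, assuming a "window" is a fixed number of consecutive lines and each query term carries a weight; best_window and its parameters are hypothetical names, not p-search's API, and the real implementation is in Elisp.
```python
from typing import Dict, List, Tuple
def best_window(lines: List[str], term_weights: Dict[str, float], size: int = 3) -> Tuple[int, float]:
    """Return (start_index, score) of the highest-scoring run of `size` lines."""
    # Score each line once: weight times number of occurrences, summed over terms.
    line_scores = [sum(w * line.lower().count(t) for t, w in term_weights.items()) for line in lines]
    size = min(size, len(lines))
    current = sum(line_scores[:size])
    best_start, best_score = 0, current
    for start in range(1, len(lines) - size + 1):
        # Slide the window down one line: add the entering line, drop the leaving one.
        current += line_scores[start + size - 1] - line_scores[start - 1]
        if current > best_score:
            best_start, best_score = start, current
    return best_start, best_score
# Hypothetical file contents and term weights.
text = ["intro", "nothing here", "search engine in emacs", "local search", "the end"]
result = best_window(text, {"search": 1.0, "emacs": 2.0}, size=2)
print(result)
```
Because only the 10 or 20 displayed results need a preview, a linear scan like this per file stays cheap.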
00:23:44.520 --> 00:23:46.879
UI stuff, there's, there's tons, there's tons more that

00:23:46.880 --> 00:23:50.159
could be done, like, especially like debuggability or like

00:23:50.160 --> 00:23:53.799
introspection. Like, so this, this result, like, for

00:23:53.800 --> 00:23:57.119
example, this result ranks really high. Maybe you don't

00:23:57.120 --> 00:24:01.719
know why though. It's like, was it because of this text query,

00:24:01.720 --> 00:24:04.479
or was it because of this criteria? I think

00:24:04.480 --> 00:24:09.039
there's some UI elements that could kind of help the user

00:24:09.040 --> 00:24:12.519
understand why results are scoring high or low. So that's

00:24:12.520 --> 00:24:15.639
definitely... And that makes a lot of sense to me. You know, a

00:24:15.640 --> 00:24:19.039
lot of it is demystifying, like understanding what you're

00:24:19.040 --> 00:24:22.719
learning better and not just finding the right thing. A lot

00:24:22.720 --> 00:24:26.519
of it is, you know, kind of exploring your data. I love that.

00:24:26.520 --> 00:24:31.639
Thanks. Okay. I'm not trying to hurry us through either by

00:24:31.640 --> 00:24:36.599
any stretch. I would be happy to see this be a conversation.

00:24:36.600 --> 00:24:42.359
I also want to be considerate of your time. And I also wanted to

00:24:42.360 --> 00:24:45.479
make a quick shout out to everybody who's been updating and

00:24:45.480 --> 00:24:50.479
helping us capture the questions and the comments in the

00:24:50.480 --> 00:24:53.639
etherpad. That's just a big help to the extent that people

00:24:53.640 --> 00:24:57.199
are jumping in there and you know, revising and extending

00:24:57.200 --> 00:24:59.799
and just doing the best job we can to capture all the

00:24:59.800 --> 00:25:00.799
thoughtful remarks.

00:25:00.800 --> 00:25:14.839
Yeah, thank you, Zac. I'm not too sure what to ask anymore,

00:25:14.840 --> 00:25:20.559
but yes, would love to try it out now. Yeah, I mean,

00:25:20.560 --> 00:25:22.076
definitely feel free to...

00:25:22.077 --> 00:25:25.679
any feedback, here's my mail, or issues...

00:25:25.680 --> 00:25:29.039
I mean, I'm happy to get any feedback. It's

00:25:29.040 --> 00:25:31.679
still in the early stages, so still kind of a lot of

00:25:31.680 --> 00:25:35.599
documentation that needs to be written. There's a lot.

00:25:35.600 --> 00:25:38.439
There's a lot on the roadmap, but yeah, I mean, hopefully, I

00:25:38.440 --> 00:25:42.759
could even publish this to ELPA and have a nice

00:25:42.760 --> 00:25:47.727
manual. So yeah, hopefully those come soon. Epic.

00:25:47.728 --> 00:25:50.279
That sounds great, yes.

NOTE gptel

00:25:50.280 --> 00:25:59.359
The ability to save your searches kind of reminds me of like

00:25:59.360 --> 00:26:05.119
the gptel package for the AI, where you can save searches,

00:26:05.120 --> 00:26:10.799
which makes it feel a lot more different. And yeah, we don't

00:26:10.800 --> 00:26:14.839
have something for that with search, but yeah, that's a

00:26:14.840 --> 00:26:19.279
whole different dynamic where it's like, okay, yeah, and

00:26:19.280 --> 00:26:24.679
makes it a unique tool that is, I guess would be unique to

00:26:24.680 --> 00:26:28.079
Emacs where you don't see that with like this AI package

00:26:28.080 --> 00:26:31.119
where the gptel is kind of unique because it's not just throw

00:26:31.120 --> 00:26:37.039
away. It's how did I get this? How did I search for it? And be an

00:26:37.040 --> 00:26:40.319
organic search, kind of like the orderless and vertico

00:26:40.320 --> 00:26:43.039
and...

00:26:43.040 --> 00:26:46.279
Yeah, that's a good, I mean, that brings me to another thing

00:26:46.280 --> 00:26:48.239
in that, so,

00:26:48.240 --> 00:26:53.199
I mean, you could easily...

00:26:53.200 --> 00:26:57.399
you could create bridges from p-search to these different

00:26:57.400 --> 00:27:01.519
other packages, like, for example, kind of a RAG search,

00:27:01.520 --> 00:27:04.679
like there's this RAG, there's this thing called a RAG

00:27:04.680 --> 00:27:06.879
workflow, which is kind of popular these days. It's like

00:27:06.880 --> 00:27:11.639
retrieval augmented generation. So, you do a search and

00:27:11.640 --> 00:27:14.199
then based off the search results you get, then you pass

00:27:14.200 --> 00:27:20.359
those into an LLM. So, the cool thing is that like you could use

00:27:20.360 --> 00:27:25.119
p-search for the retrieval. And so you could even like, I

00:27:25.120 --> 00:27:28.799
mean, you could even ask an LLM to come up with the search terms

00:27:28.800 --> 00:27:32.079
and then have it search. There's no

00:27:32.080 --> 00:27:35.439
programmatic interface now to do this exact workflow.

00:27:35.440 --> 00:27:39.039
But I mean, there's another kind of direction I'm starting

00:27:39.040 --> 00:27:43.199
to think about. So like you could have maybe

00:27:43.200 --> 00:27:47.759
a question answer kind of workflow where it does

00:27:47.760 --> 00:27:51.639
like an initial search for the terms and then you get the top

00:27:51.640 --> 00:27:57.199
results and then you can put that through maybe gptel or all

00:27:57.200 --> 00:27:59.759
these other different systems. So that's, and that seems

00:27:59.760 --> 00:28:01.479
like a promising thing. And then another thing is like,

NOTE Saving a search

00:28:01.480 --> 00:28:10.594
well, you mentioned the ability to save a search.

00:28:10.595 --> 00:28:11.479
One thing I've noticed

00:28:11.480 --> 00:28:15.359
kind of like with the DevOps workflows is, I'll write a

00:28:15.360 --> 00:28:20.519
CLI command that I do, or like a calculator command. Then I end

00:28:20.520 --> 00:28:23.999
up in the org mode document, write what I wrote, had the

00:28:24.000 --> 00:28:26.943
results in there, and then I'll go back to that.

00:28:26.944 --> 00:28:31.966
It's like, oh, this is why, this is that calculation I did

00:28:31.967 --> 00:28:34.007
and this is why I did it.

00:28:34.008 --> 00:28:36.959
I'll have run the same tool three different

00:28:36.960 --> 00:28:40.519
times to get three different answers, if it was like a

00:28:40.520 --> 00:28:41.799
calculator, for example.

NOTE Workflows

00:28:41.800 --> 00:28:49.319
But yeah, that's a very unique feature that isn't seen and

00:28:49.320 --> 00:28:53.959
will make me look at it and see about integrating it into my

00:28:53.960 --> 00:28:59.079
workflow. Yeah, I think you get on some interesting, you

00:28:59.080 --> 00:29:03.159
know, kind of what makes Emacs really unique there and how

00:29:03.160 --> 00:29:07.399
to... interesting kind of ways to exploit

00:29:07.400 --> 00:29:12.439
Emacs to learn in the problem. I'm seeing a number of

00:29:12.440 --> 00:29:15.799
ways you're getting at that. For example, if I think about

00:29:15.800 --> 00:29:18.999
like an automation workflow, and there's just a million

00:29:19.000 --> 00:29:22.719
we'll say, assumptions that are baked into a search

00:29:22.720 --> 00:29:26.719
product, so to speak, like represented by a Google search or

00:29:26.720 --> 00:29:31.639
Bing or what have you. And then as I unpack that and repack it

00:29:31.640 --> 00:29:35.159
from an Emacs workflow standpoint, thinking about, well,

00:29:35.160 --> 00:29:39.079
first of all, what is the yak I'm shaving? And then also, what

00:29:39.080 --> 00:29:43.759
does doing it right mean? How would I reuse this? How would I

00:29:43.760 --> 00:29:47.679
make the code accessible to others for their own purposes in

00:29:47.680 --> 00:29:52.439
a free software world kind of way? And all of the different

00:29:52.440 --> 00:29:57.479
sort of say like orthogonal headspacey kind of things,

00:29:57.480 --> 00:30:00.079
right? Emacs brings a lot to the table from a search

00:30:00.080 --> 00:30:03.719
standpoint because I'm going to want to think about. I'm

00:30:03.720 --> 00:30:07.799
going to want to think about where does the UI come in? Where

00:30:07.800 --> 00:30:11.399
might the user want to get involved interactively? Where

00:30:11.400 --> 00:30:14.359
might the user want to get involved declaratively with

00:30:14.360 --> 00:30:16.919
their configuration, perhaps based on the particular

00:30:16.920 --> 00:30:21.359
environment where this Emacs is running? And there's just a

00:30:21.360 --> 00:30:24.879
lot of what Emacs users think about that really applies.

00:30:24.880 --> 00:30:28.359
I'll use the word again, orthogonally across all my many

00:30:28.360 --> 00:30:33.239
workflows as an Emacs user. You know, the search is just such

00:30:33.240 --> 00:30:38.519
a big word. Yeah, that's actually, this exact point I was

00:30:38.520 --> 00:30:43.159
thinking about with this. It's like, I mean, it seems kind of

00:30:43.160 --> 00:30:46.319
obvious, like just like using grep or something, just like to

00:30:46.320 --> 00:30:49.359
get search counts, like, okay, you can just run the command,

00:30:49.360 --> 00:30:51.439
get the term counts and you could just run it through a

00:30:51.440 --> 00:30:55.959
relatively simple algorithm to get your search score. So

00:30:55.960 --> 00:31:01.759
if it's this easy, though, why don't we see this in other... And

00:31:01.760 --> 00:31:06.919
the results are actually surprisingly good. So why don't we

00:31:06.920 --> 00:31:10.559
see this anywhere, really? And it occurred to me that just

00:31:10.560 --> 00:31:16.399
the amount of configuration... The amount of setup you have to

00:31:16.400 --> 00:31:20.039
do to get it right.

00:31:20.040 --> 00:31:24.599
It's above this threshold that you need something like

00:31:24.600 --> 00:31:27.856
Emacs to kind of get pushed through that configuration.

NOTE Transient and configuration

00:31:27.857 --> 00:31:30.799
So for example, that's why I rely heavily on transient

00:31:30.800 --> 00:31:34.119
to set up the system. 'Cause like, if you want to get good

00:31:34.120 --> 00:31:36.079
search results, you're going to have to configure a lot

00:31:36.080 --> 00:31:38.519
of stuff. I want this directory. I want this, I don't

00:31:38.520 --> 00:31:41.559
want this directory. I want these search terms, you know,

00:31:41.560 --> 00:31:48.159
there's a lot to set up. And in most programs, I mean, they

00:31:48.160 --> 00:31:52.079
don't have an easy way to, I mean, they'll often try and try to

00:31:52.080 --> 00:31:55.039
hide all this complexity. Like they say, okay, our users

00:31:55.040 --> 00:31:59.199
too, you know, we don't want to, you know, we don't wanna, you

00:31:59.200 --> 00:32:02.719
know, make our users, we don't wanna scare our users with

00:32:02.720 --> 00:32:06.879
like, complicated search engine configuration. So we're

00:32:06.880 --> 00:32:09.079
just going to do it all in the background and we're just not

00:32:09.080 --> 00:32:12.599
going to let the user even know that it's happening. I mean,

00:32:12.600 --> 00:32:15.119
that's the third time you've made me laugh out loud. Sorry

00:32:15.120 --> 00:32:17.879
for interrupting you, but yeah, you're just spot on there.

00:32:17.880 --> 00:32:22.999
You're some people's users. Am I right? Like, you know, and

00:32:23.000 --> 00:32:25.390
also some people's workflows.

NOTE Problem space

00:32:25.391 --> 00:32:27.719
And, you know, another case

00:32:27.720 --> 00:32:30.799
where just like, if you're thinking about Emacs, you either

00:32:30.800 --> 00:32:33.279
have to pick a tunnel to dive into and be like, no, this is

00:32:33.280 --> 00:32:37.759
going to be right for my work, or your problem space is never

00:32:37.760 --> 00:32:40.879
ending in terms of discovering the ways other people are

00:32:40.880 --> 00:32:45.839
using Emacs and how that breaks your feature, and how that

00:32:45.840 --> 00:32:49.679
breaks your conceptualization of the problem space,

00:32:49.680 --> 00:32:53.559
right? Or you just have to get so narrowed down that it can

00:32:53.560 --> 00:32:57.119
actually be hard to find people that quite understand

00:32:57.120 --> 00:33:00.279
you, right? You get into the particular, well, it solves

00:33:00.280 --> 00:33:03.039
these three problems for me. Well, what are these three

00:33:03.040 --> 00:33:08.639
problems again? And this is a month to unpack. You have Emacs

00:33:08.640 --> 00:33:12.639
and I don't know, it's like you got a lot of, they all agree is

00:33:12.640 --> 00:33:16.559
like, we're going to use Elisp to set variables. Every Emacs

00:33:16.560 --> 00:33:21.199
package is going to do that. We're going to use Elisp and have a

00:33:21.200 --> 00:33:25.479
search in place to put our documentation and like it does

00:33:25.480 --> 00:33:32.559
also eliminate a lot of confusion and gives a lot of

00:33:32.560 --> 00:33:37.719
expectations of what they want. One thing that I'm

00:33:37.720 --> 00:33:39.855
surprised I haven't seen elsewhere is you have the

NOTE consult-omni

00:33:39.856 --> 00:33:44.239
consult-omni package which allows you to search multiple websites

00:33:44.240 --> 00:33:49.799
simultaneously for multiple web search engines, and put

00:33:49.800 --> 00:33:52.799
them in one thing and it's like, and then you use orderless.

NOTE orderless

00:33:52.800 --> 00:33:55.159
Why would you use orderless? Because that's what you

00:33:55.160 --> 00:33:57.799
configured and you know exactly what you wanna use and you

00:33:57.800 --> 00:34:01.679
use the same font and your same minibuffer and you use all

00:34:01.680 --> 00:34:04.079
that existing configuration because, well, you're an

00:34:04.080 --> 00:34:07.599
Emacs user or like you're a command line user. You know how

00:34:07.600 --> 00:34:11.559
you want these applications to go. You don't want them to be

00:34:11.560 --> 00:34:17.399
reinventing the wheel 1,600 times in 1,600 different ways;

00:34:17.400 --> 00:34:23.079
you want it to use your minibuffer, your font, your et

00:34:23.080 --> 00:34:28.159
cetera, et cetera, et cetera. But I haven't

00:34:28.160 --> 00:34:32.479
seen a website where I can search multiple websites at the

00:34:32.480 --> 00:34:35.159
same time in something like Emacs before. And it's like,

00:34:35.160 --> 00:34:38.319
yeah, with my sorting algorithm,

00:34:38.320 --> 00:34:49.359
Yeah, exactly. Yeah. Yeah. Yeah. I mean, just setting the

00:34:49.360 --> 00:34:57.079
bar for configuration and set up just like, yeah, you have to

00:34:57.080 --> 00:35:02.839
have a list. Yeah. I mean, it, it does, obviously it's not,

00:35:02.840 --> 00:35:05.839
it's not the most beginner-friendly, but I mean, it,

00:35:05.840 --> 00:35:10.319
yeah, it definitely widens the amount of the solution space

00:35:10.320 --> 00:35:14.679
you can have to such problems. Oh my gosh, you used the word

00:35:14.680 --> 00:35:18.759
solution space. I love it. But on the flip side, it's like,

00:35:18.760 --> 00:35:25.119
why does Emacs get this consult-omni package? Or let's see,

00:35:25.120 --> 00:35:30.719
you have elfeed-tube where it will put a flowing

00:35:30.720 --> 00:35:34.479
transcript on a YouTube video or you got your package. Why

00:35:34.480 --> 00:35:39.879
does it get all these applications? And I don't see

00:35:39.880 --> 00:35:45.679
applications like this as much outside of Emacs. So there's

00:35:45.680 --> 00:35:46.267
a way that it just makes it easier.

NOTE User interface

00:35:46.268 --> 00:35:47.479
It's because user

00:35:47.480 --> 00:35:51.439
interface is the, you know, it's the "economy, stupid" of

00:35:51.440 --> 00:35:58.119
technology, right? If you grab people by the UX, you can sell

00:35:58.120 --> 00:36:01.679
a million of any product that solves a problem that I didn't

00:36:01.680 --> 00:36:04.639
think technology could solve, or that I didn't think I had

00:36:04.640 --> 00:36:08.319
the patience to use technology to solve, which is a lot of

00:36:08.320 --> 00:36:12.159
times what it comes down to. And here exactly is the, you

00:36:12.160 --> 00:36:16.799
know, the Emacs sort of conundrum, right? How much time

00:36:16.800 --> 00:36:20.759
should I spend today updating my Emacs so that tomorrow I can

00:36:20.760 --> 00:36:26.319
just work more, right? And, you know, I love that little

00:36:26.320 --> 00:36:29.839
graph of the Emacs learning curve, right? Where it's this

00:36:29.840 --> 00:36:33.399
concentric, it becomes this concentric spiral, right? The

00:36:33.400 --> 00:36:38.759
Vim learning curve is like a ladder, right? Or, you know,

00:36:38.760 --> 00:36:44.119
and the nano learning curve is like just a flat plane, you

00:36:44.120 --> 00:36:49.279
know, or a ladder, a vertical ladder or a horizontal ladder.

00:36:49.280 --> 00:36:56.719
There we go. And the Emacs learning curve is this kind of

00:36:56.720 --> 00:36:59.799
straight up line until it curves back on itself and

00:36:59.800 --> 00:37:03.079
eventually spirals. And the more you learn, the harder it is

00:37:03.080 --> 00:37:05.839
to learn the next thing. And are you really moving forward at

00:37:05.840 --> 00:37:09.039
all? Like, it just works for me. What a great analogy. And

00:37:09.040 --> 00:37:15.279
that's my answer, I think. Yeah. You know, it's because

00:37:15.280 --> 00:37:20.199
we... The spiral is great. Sorry. There are each of these

00:37:20.200 --> 00:37:26.639
weird little packages that some of us, you know, it solves

00:37:26.640 --> 00:37:29.279
that one problem and lets us get back to work. And for others,

00:37:29.280 --> 00:37:32.439
it makes us go, gosh, now that makes me rethink a whole bunch

00:37:32.440 --> 00:37:35.239
of things because there's... Like I don't even know what

00:37:35.240 --> 00:37:37.719
you're talking about with some of your conceptualizations

00:37:37.720 --> 00:37:41.039
of UI. Maybe it comes from Visual Studio, and I've not

00:37:41.040 --> 00:37:44.679
used that or something. So for you, it's a perfectly normal UX

00:37:44.680 --> 00:37:48.799
paradigm that you kind of lean on. For others, it's like, you

00:37:48.800 --> 00:37:51.999
know, occupying some screen space, and I don't know what the

00:37:52.000 --> 00:37:57.759
gadgets do and when I open them up... They're thinking

00:37:57.760 --> 00:38:00.999
about... they have... they imply their own

00:38:01.000 --> 00:38:03.639
abstractions let's say logically against a programming

00:38:03.640 --> 00:38:06.999
language. This would be tree-sitter, right? If I'm not used to

00:38:07.000 --> 00:38:11.719
thinking in terms of an abstract syntax tree, some

00:38:11.720 --> 00:38:14.799
of the concepts just aren't as natural for me. If I'm used to

00:38:14.800 --> 00:38:19.039
like Emacs at a more fundamental level is, or the old modes,

00:38:19.040 --> 00:38:23.479
right, we're used to thinking in terms of progressing

00:38:23.480 --> 00:38:26.959
forward through some text, managing a stack of markers into

00:38:26.960 --> 00:38:29.239
the text, right? It's a different paradigm. The world

00:38:29.240 --> 00:38:33.559
changes. Emacs kind of supports it all. That's why all the

00:38:33.560 --> 00:38:37.039
apps are built there. That's why when you're talking about

00:38:37.040 --> 00:38:40.759
that spiral, what that hints at is that this is really just a

00:38:40.760 --> 00:38:44.239
different algorithm that you're transferring out that

00:38:44.240 --> 00:38:47.319
makes some things a lot easier and some things a lot harder.

00:38:47.320 --> 00:38:51.719
That's why I was bringing in those three packages, because

00:38:51.720 --> 00:38:59.708
in some way it's making these search terms with reusable...

00:38:59.709 --> 00:39:07.083
Let's see... saveable buffers or interactive buffers in a way

00:39:07.084 --> 00:39:10.359
that... in a way, that is bigger than what I think it should have,

00:39:10.360 --> 00:39:15.479
especially in comparison to like how many people use

00:39:15.480 --> 00:39:20.319
YouTube, but I don't see very many YouTube apps that will

00:39:20.320 --> 00:39:26.279
show a rolling subtitle list that you can click on to move up

00:39:26.280 --> 00:39:27.315
and down the video

00:39:27.316 --> 00:39:30.139
even though YouTube's been around for years.

00:39:30.140 --> 00:39:33.359
Why does Emacs have a very good implementation

00:39:33.360 --> 00:39:37.159
that was duct taped together? So before I let you respond to

00:39:37.160 --> 00:39:40.439
that, Zac, let me just say we're coming up on eating up a

00:39:40.440 --> 00:39:43.879
whole half hour of your lunchtime and thank you for giving us

00:39:43.880 --> 00:39:47.879
that extra time. But let me just say, let's, you know, if I

00:39:47.880 --> 00:39:50.879
could ask you to take like up to another five minutes and then

00:39:50.880 --> 00:39:53.759
I'll try to kick us off here and make sure everybody does

00:39:53.760 --> 00:39:54.999
remember to eat.

00:39:55.000 --> 00:40:04.119
Yeah, so yeah, it looks like there's one other question. So

NOTE Q: Do you think the Emacs being kinda slow will get in the way of being able to run a lot of scoring algorithms?

00:40:04.120 --> 00:40:06.679
yeah, do you think Emacs being kind of slow will get in the way

00:40:06.680 --> 00:40:11.319
of being able to run a lot of scoring algorithms? So this is

00:40:11.320 --> 00:40:15.039
actually a thought I had. Yeah, Emacs, because the code

00:40:15.040 --> 00:40:19.919
currently kind of does, I mean, it kind of does, it's kind of

00:40:19.920 --> 00:40:24.039
dumb in a lot of places. A lot of times it just, it does just go

00:40:24.040 --> 00:40:27.599
through all the files and then just compute some score for

00:40:27.600 --> 00:40:30.679
them. But I'm surprised that it's, that part actually isn't

00:40:30.680 --> 00:40:34.799
that slow. Like, like it turns out like, okay, like if you

00:40:34.800 --> 00:40:40.759
take, for example, Emacs, like the Emacs directory or the

00:40:40.760 --> 00:40:44.879
Emacs Git repository, or maybe another big Git repository,

00:40:44.880 --> 00:40:49.079
like you could have an Elisp function enumerate those, and

00:40:49.080 --> 00:40:52.599
multiply some numbers, maybe multiply 10 numbers

00:40:52.600 --> 00:41:01.039
together. And that isn't that slow. And that's the bulk of

00:41:01.040 --> 00:41:05.799
what the only thing that Elisp has to do is just like multiply

00:41:05.800 --> 00:41:11.599
these numbers. Obviously, if you have to resort to Elisp to

00:41:11.600 --> 00:41:15.519
search all the files and you have like 10 or 100,000 files,

00:41:15.520 --> 00:41:18.759
then yeah, Emacs will be slow

00:41:18.760 --> 00:41:23.959
to manually search, like if you're not using ripgrep or any

00:41:23.960 --> 00:41:26.839
faster tool and you have, and you have millions of files and

00:41:26.840 --> 00:41:30.959
yeah, it will be slow. But what I noticed though is like, for

00:41:30.960 --> 00:41:35.119
example, let's say you want to search for, let's say you want

00:41:35.120 --> 00:41:40.199
to search like info directory, like info files for Emacs and

00:41:40.200 --> 00:41:46.039
the Emacs info file and the Elisp info file. So those are two

00:41:46.040 --> 00:41:49.279
decently sized kind of books, kind of like reference

00:41:49.280 --> 00:41:50.199
material on Emacs.

00:41:50.200 --> 00:41:55.999
Relying on Elisp to search both of those together, it's

00:41:56.000 --> 00:41:58.079
actually pretty, it's actually like almost instant. I

00:41:58.080 --> 00:42:00.639
mean, it's not slow enough. So I think that's

00:42:00.640 --> 00:42:03.679
another thing is like scale. Like I think on, on kind of like

00:42:03.680 --> 00:42:09.679
individual human level scales, I think Elisp can be good

00:42:09.680 --> 00:42:14.359
enough. If you're going on the scale of like enterprise,

00:42:14.360 --> 00:42:18.399
like all the repositories, all the Git repositories of an

00:42:18.400 --> 00:42:21.199
enterprise, then yeah, that scale might, it might, it might

00:42:21.200 --> 00:42:26.039
be too much. But I think on, on the scale of what most

00:42:26.040 --> 00:42:30.519
individuals have to deal with on a daily basis, like for

00:42:30.520 --> 00:42:34.719
example, maybe somebody has some, yeah, I mean, I think it

00:42:34.720 --> 00:42:36.959
should, I think it hopefully should be enough. And if not,

00:42:36.960 --> 00:42:39.639
there's always room for optimizations.

00:42:39.640 --> 00:42:55.999
Yeah, so so I'll redirect you a little bit because based on a

00:42:56.000 --> 00:43:00.279
couple of things I got into, you know, or if you want to be done

00:43:00.280 --> 00:43:04.759
be like, you know, give me the high sign by all means and we can

00:43:04.760 --> 00:43:08.639
we can shut up shop, but I'm curious, you know, what are what

NOTE Boundary conditions

00:43:08.640 --> 00:43:13.079
are your boundary conditions? What what tends to cause you

00:43:13.080 --> 00:43:16.679
to to to write something more complicated and what what

00:43:16.680 --> 00:43:20.959
causes you to sort of work around it with more complex

00:43:20.960 --> 00:43:23.559
workflow in Emacs terms, like where do you break out the big

00:43:23.560 --> 00:43:27.919
guns? Just thinking about, like search, we talked about,

00:43:27.920 --> 00:43:31.439
maybe that's too abstract a question, but just general

00:43:31.440 --> 00:43:36.679
usage. Search is an example where almost all of us have

00:43:36.680 --> 00:43:39.599
probably written something to go find something, right?

00:43:39.600 --> 00:43:43.519
Yeah, I mean, this is a good question. I'm actually of the

00:43:43.520 --> 00:43:51.999
idea, at my work, for example, I tried to get rid of all, I

00:43:52.000 --> 00:43:54.879
mean, this is probably a typical Emacs user thing, but like,

00:43:54.880 --> 00:43:59.319
I mean, I think that just like getting, just like having

00:43:59.320 --> 00:44:02.559
Emacs expand to whatever it can get into and whatever it can

00:44:02.560 --> 00:44:08.839
automate, like any task, any, like, just like the more you

00:44:08.840 --> 00:44:13.719
can kind of get that coded, I actually find that kind of like,

00:44:13.720 --> 00:44:20.439
I mean, it is kind of like a meme. Like, yeah, I have to

00:44:20.440 --> 00:44:24.199
configure my Emacs until it's fun, and then I'll do it. But I

00:44:24.200 --> 00:44:27.959
actually I actually think that maybe for like a normal

00:44:27.960 --> 00:44:31.999
software developer, if you invest, if you invest, maybe,

00:44:32.000 --> 00:44:34.839
maybe you have like some spare time after you've done all

00:44:34.840 --> 00:44:39.679
your tasks, if you invest all that time in, in just like kind

00:44:39.680 --> 00:44:42.359
of going through all the workflows, all the, you know, just,

00:44:42.360 --> 00:44:46.279
just getting all of that in, in Emacs, then I think that that,

00:44:46.280 --> 00:44:52.039
that acts as kind of like a, it kind of like a productivity

00:44:52.040 --> 00:44:56.759
multiplier. And so. So I found that, I mean, I found to not

00:44:56.760 --> 00:44:59.519
have those boundaries. I mean, obviously there's things

00:44:59.520 --> 00:45:04.599
you can't do, like web-based things. I mean, that's a hard

00:45:04.600 --> 00:45:10.199
boundary, but that's more because... Yeah, there's really

00:45:10.200 --> 00:45:13.719
not much to do about that. Nobody's written a front-end

00:45:13.720 --> 00:45:18.759
engine, and too much of the forebrain is occupied with

00:45:18.760 --> 00:45:22.559
things that should happen on the "end-users

00:45:22.560 --> 00:45:29.839
infrastructure", so to speak. So with like 40 seconds left, I

00:45:29.840 --> 00:45:33.519
was going to say a minute, but I guess, any final thoughts?

00:45:33.520 --> 00:45:40.159
Yeah, I mean, just thank you for listening, and thank you

00:45:40.160 --> 00:45:45.559
for putting this on. It's a really nice conference to have,

00:45:45.560 --> 00:45:50.679
and I'm glad things like this exist. So thank you. Yeah, it's

00:45:50.680 --> 00:45:54.639
you and the other folks on this call. Thank you so much,

00:45:54.640 --> 00:45:58.639
PlasmaStrike, and all the rest of you for hopping on the BBB

00:45:58.640 --> 00:46:03.119
and having such an interesting discussion. Keeps it really

00:46:03.120 --> 00:46:08.239
fun for us as organizers. And thanks, everybody, for being

00:46:08.240 --> 00:46:21.320
here.