path: root/2025/captions/emacsconf-2025-private-ai--emacs-and-private-ai-a-great-match--aaron-grothe--main.vtt
WEBVTT

00:00:00.000 --> 00:00:04.859
Hey, everybody. Welcome from frigid Omaha, Nebraska.

00:00:04.860 --> 00:00:06.619
I'm just going to kick off my talk here,

00:00:06.620 --> 00:00:23.899
and we'll see how it all goes. Thanks for attending.

00:00:23.900 --> 00:00:26.939
So the slides will be available on my site, growthy.us,

00:00:26.940 --> 00:00:29.899
in the presentation section tonight or tomorrow.

00:00:29.900 --> 00:00:33.099
This is a quick intro to one way to do private AI in Emacs.

00:00:33.100 --> 00:00:35.299
There are a lot of other ways to do it.

00:00:35.300 --> 00:00:38.899
This one is really just more or less the easiest way to do it.

00:00:38.900 --> 00:00:40.379
It's a minimal viable product

00:00:40.380 --> 00:00:42.379
to get you an idea of how to get started with it

00:00:42.380 --> 00:00:43.859
and how to give it a spin.

00:00:43.860 --> 00:00:45.819
Really hope some of you give it a shot

00:00:45.820 --> 00:00:48.179
and learn something along the way.

00:00:48.180 --> 00:00:50.379
So the overview of the talk.

00:00:50.380 --> 00:00:54.939
I broke it down into these basic bullet points: why private AI,

00:00:54.940 --> 00:00:58.939
what do I need to do private AI, Emacs and private AI,

00:00:58.940 --> 00:01:02.739
pieces for an AI Emacs solution,

00:01:02.740 --> 00:01:08.059
a demo of a minimal viable product, and the summary.

00:01:08.060 --> 00:01:10.779
Why private AI? This is pretty simple.

00:01:10.780 --> 00:01:12.099
Just read the terms and conditions

00:01:12.100 --> 00:01:14.819
for any AI system you're currently using.

00:01:14.820 --> 00:01:17.019
If you're using the free tiers, your queries,

00:01:17.020 --> 00:01:18.619
code, and uploaded information

00:01:18.620 --> 00:01:20.699
are being used to train the models.

00:01:20.700 --> 00:01:22.939
In some cases, you are giving the company

00:01:22.940 --> 00:01:25.419
a perpetual license to your data.

00:01:25.420 --> 00:01:27.059
You have no control over this,

00:01:27.060 --> 00:01:29.219
except for not using the engine.

00:01:29.220 --> 00:01:30.699
And keep in mind, the terms

00:01:30.700 --> 00:01:32.179
are changing all the time on that,

00:01:32.180 --> 00:01:34.139
and they're not normally changing for our benefit.

00:01:34.140 --> 00:01:38.259
So that's not necessarily a good thing.

00:01:38.260 --> 00:01:40.339
If you're using the paid tiers,

00:01:40.340 --> 00:01:43.459
you may be able to opt out of the data collection.

00:01:43.460 --> 00:01:45.539
But keep in mind, this can change,

00:01:45.540 --> 00:01:48.619
or they may start charging for that option.

00:01:48.620 --> 00:01:51.419
Every AI company wants more and more data.

00:01:51.420 --> 00:01:53.779
They need more and more data to train their models.

00:01:53.780 --> 00:01:56.019
It is just the way it is.

00:01:56.020 --> 00:01:57.899
They need more and more information

00:01:57.900 --> 00:02:00.459
to get it more and more accurate and to keep it up to date.

00:02:00.460 --> 00:02:03.219
There's been a story about Stack Overflow.

00:02:03.220 --> 00:02:05.819
It has like half the number of queries they had a year ago

00:02:05.820 --> 00:02:07.379
because people are using AI.

00:02:07.380 --> 00:02:08.579
The problem with that is now

00:02:08.580 --> 00:02:10.379
there's less data going to Stack Overflow

00:02:10.380 --> 00:02:12.979
for the AI to get. It's a vicious cycle,

00:02:12.980 --> 00:02:14.619
especially when you start looking at

00:02:14.620 --> 00:02:16.579
newer languages like Ruby and stuff like that.

00:02:16.580 --> 00:02:21.419
So it comes down to being an interesting time.

00:02:21.420 --> 00:02:24.739
Another reason why to go private AI is your costs are going to vary.

00:02:24.740 --> 00:02:27.019
Right now, these services are being heavily subsidized.

00:02:27.020 --> 00:02:29.419
If you're paying Claude $20 a month,

00:02:29.420 --> 00:02:32.579
it is not costing the Claude folks $20 a month

00:02:32.580 --> 00:02:34.099
to host all the infrastructure

00:02:34.100 --> 00:02:35.619
to build all these data centers.

00:02:35.620 --> 00:02:38.779
They are heavily subsidizing that,

00:02:38.780 --> 00:02:41.259
very much at a loss right now.

00:02:41.260 --> 00:02:43.659
When they start charging the real costs plus a profit,

00:02:43.660 --> 00:02:45.499
it's going to change.

00:02:45.500 --> 00:02:48.019
Right now, I use a bunch of different services.

00:02:48.020 --> 00:02:50.019
I've played with Grok and a bunch of other ones.

00:02:50.020 --> 00:02:52.459
But Grok right now is like $30 a month

00:02:52.460 --> 00:02:54.139
for a regular Super Grok.

00:02:54.140 --> 00:02:56.419
When they start charging the real cost of that,

00:02:56.420 --> 00:02:59.819
it's going to go from $30 to something a great deal more,

00:02:59.820 --> 00:03:02.379
perhaps, I think, $100 or $200

00:03:02.380 --> 00:03:04.459
or whatever really turns out to be the cost

00:03:04.460 --> 00:03:06.059
when you figure everything into it.

00:03:06.060 --> 00:03:07.539
When you start adding that cost into that,

00:03:07.540 --> 00:03:10.179
a lot of people who are using public AI right now

00:03:10.180 --> 00:03:11.899
are going to have no option but to move to private AI

00:03:11.900 --> 00:03:16.019
or give up on AI overall.

00:03:16.020 --> 00:03:18.659
What do you need to be able to do private AI?

00:03:18.660 --> 00:03:21.179
If you're going to run your own AI,

00:03:21.180 --> 00:03:23.579
you're going to need a system with either some CPU cores,

00:03:23.580 --> 00:03:25.699
a graphics processor unit,

00:03:25.700 --> 00:03:28.339
or a neural processing unit, a GPU or an NPU.

00:03:28.340 --> 00:03:29.819
I currently have four systems

00:03:29.820 --> 00:03:32.979
I'm experimenting with and playing around with on a daily basis.

00:03:32.980 --> 00:03:37.979
I have a System76 Pangolin with an AMD Ryzen 7 7840U

00:03:37.980 --> 00:03:41.099
with a Radeon 780M integrated graphics card.

00:03:41.100 --> 00:03:42.539
It's got 32 gigs of RAM.

00:03:42.540 --> 00:03:45.259
It's a beautiful piece of hardware. I really do like it.

00:03:45.260 --> 00:03:46.499
I have my main workstation,

00:03:46.500 --> 00:03:50.579
it's an HP Z620 with dual Intel Xeons

00:03:50.580 --> 00:03:53.179
with four NVIDIA K2200 graphics cards in it.

00:03:53.180 --> 00:03:56.699
Why the four NVIDIA K2200 graphics cards in it?

00:03:56.700 --> 00:03:59.739
Because I could buy four of them on eBay for $100

00:03:59.740 --> 00:04:02.379
and it was still supported by the NVIDIA drivers for Debian.

00:04:02.380 --> 00:04:08.179
So that's why that is. A MacBook Air with an M1 processor,

00:04:08.180 --> 00:04:10.939
a very nice piece of kit I picked up a couple years ago,

00:04:10.940 --> 00:04:14.139
very cheap, but it runs AI surprisingly well,

00:04:14.140 --> 00:04:18.099
and an Acer Aspire 1 with an AMD Ryzen 5700H in it.

00:04:18.100 --> 00:04:22.099
This was my old laptop. It was a sturdy beast.

00:04:22.100 --> 00:04:24.379
It was able to do enough AI to do demos and stuff,

00:04:24.380 --> 00:04:25.859
and I liked it quite a bit for that.

00:04:25.860 --> 00:04:28.339
I'm using the Pangolin for this demonstration

00:04:28.340 --> 00:04:30.979
because it's just better.

00:04:30.980 --> 00:04:37.219
Apple's M4 chip has 38 TOPS of NPU performance.

00:04:37.220 --> 00:04:40.099
Microsoft Copilot+ PCs are now requiring

00:04:40.100 --> 00:04:41.459
45 TOPS of NPU

00:04:41.460 --> 00:04:43.939
to be able to have the Copilot badge on them.

00:04:43.940 --> 00:04:48.299
And Raspberry Pi's new AI HAT is about 18 TOPS

00:04:48.300 --> 00:04:51.219
and is $70 on top of the cost of a Raspberry Pi 5.

00:04:51.220 --> 00:04:56.059
Keep in mind, Raspberry Pi recently

00:04:56.060 --> 00:04:59.499
raised the cost of their Pi 5s because of RAM pricing,

00:04:59.500 --> 00:05:00.379
which is going to be affecting

00:05:00.380 --> 00:05:02.459
a lot of these types of solutions in the near future.

00:05:02.460 --> 00:05:05.299
But there's going to be a lot of

00:05:05.300 --> 00:05:06.699
local power available in the future.

00:05:06.700 --> 00:05:08.219
That's what it really comes down to.

00:05:08.220 --> 00:05:11.179
A lot of people are going to have PCs on their desks.

00:05:11.180 --> 00:05:13.459
They're going to run a decent private AI

00:05:13.460 --> 00:05:18.059
without much issue. So for Emacs and private AI,

00:05:18.060 --> 00:05:20.139
there's a couple popular solutions.

00:05:20.140 --> 00:05:22.099
Gptel, which is the one we're going to talk about.

00:05:22.100 --> 00:05:24.739
It's a simple interface. It's a minimal interface.

00:05:24.740 --> 00:05:26.579
It integrates easily into your workflow.

00:05:26.580 --> 00:05:29.019
It's just, quite honestly, chef's kiss,

00:05:29.020 --> 00:05:31.059
just a beautifully well-done piece of software.

00:05:31.060 --> 00:05:33.859
OllamaBuddy has more features,

00:05:33.860 --> 00:05:36.259
a menu interface, has quick access

00:05:36.260 --> 00:05:37.499
for things like code refactoring,

00:05:37.500 --> 00:05:38.979
text reformatting, et cetera.

00:05:38.980 --> 00:05:41.979
This is the one that you spend a little more time with,

00:05:41.980 --> 00:05:43.939
but you also get a little bit more back from it.

00:05:43.940 --> 00:05:49.419
Ellama is another one, has some really good features to it,

00:05:49.420 --> 00:05:51.059
more different capabilities,

00:05:51.060 --> 00:05:54.979
but it brings a different set of rules and capabilities.

00:05:54.980 --> 00:05:59.179
Aidermacs, which is programming with your AI in Emacs.

00:05:59.180 --> 00:06:01.219
The closest thing I can come up

00:06:01.220 --> 00:06:04.139
to comparing this to is Cursor, except it's in Emacs.

00:06:04.140 --> 00:06:05.659
It's really quite well done.

00:06:05.660 --> 00:06:07.299
These are all really quite well done.

00:06:07.300 --> 00:06:08.499
There's a bunch of other projects out there.

00:06:08.500 --> 00:06:10.819
If you go out to GitHub, type Emacs AI,

00:06:10.820 --> 00:06:13.219
you'll find a lot of different options.

00:06:13.220 --> 00:06:18.459
So what is a minimal viable product that can be done?

00:06:18.460 --> 00:06:23.379
A minimal viable product showing what an AI Emacs solution is

00:06:23.380 --> 00:06:27.179
can be built with only two pieces of software.

00:06:27.180 --> 00:06:31.179
Llamafile, this is an amazing piece of software.

00:06:31.180 --> 00:06:32.899
This is a whole LLM contained in one file.

00:06:32.900 --> 00:06:36.059
And the same file runs on Mac OS X,

00:06:36.060 --> 00:06:39.379
Linux, Windows, and the BSDs.

00:06:39.380 --> 00:06:42.179
It's a wonderful piece of kit

00:06:42.180 --> 00:06:44.179
It's based on the work of the people who created

00:06:44.180 --> 00:06:45.899
this thing called Cosmopolitan,

00:06:45.900 --> 00:06:46.779
which lets you create an executable

00:06:46.780 --> 00:06:48.699
that runs on a bunch of different systems.

00:06:48.700 --> 00:06:51.299
And Gptel, which is an easy plug-in for Emacs,

00:06:51.300 --> 00:06:54.979
which we talked about in the last slide a bit.

00:06:54.980 --> 00:07:00.179
So setting up the LLM, you have to just go out

00:07:00.180 --> 00:07:01.699
and just hit the page for it

00:07:01.700 --> 00:07:05.099
and go out and do a wget of it.

00:07:05.100 --> 00:07:07.099
That's all it takes there.

00:07:07.100 --> 00:07:10.259
Then chmod it so you can actually execute it.

00:07:10.260 --> 00:07:12.939
And then just go ahead and actually run it.
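
The wget, chmod, and run steps just described can also be sketched from inside Emacs itself. This is only an illustration: the URL and file name below are placeholders, not a real release path, so substitute whichever llamafile you actually download.

```elisp
;; Sketch only: URL and file name are placeholders for a real llamafile release.
(require 'url)
(let ((url "https://example.com/path/to/model.llamafile") ; substitute the real link
      (file (expand-file-name "~/model.llamafile")))
  (url-copy-file url file t)                       ; the "wget" step
  (set-file-modes file #o755)                      ; the "chmod +x" step
  (start-process "llamafile" "*llamafile*" file))  ; run it; it serves on port 8080
```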

00:07:12.940 --> 00:07:16.939
And let's go ahead and do that.

00:07:16.940 --> 00:07:18.899
I've already downloaded it because I don't want to wait.

00:07:18.900 --> 00:07:21.259
And let's just take a look at it.

00:07:21.260 --> 00:07:22.899
I've actually downloaded several of them,

00:07:22.900 --> 00:07:25.699
but let's go ahead and just run Llama 3.2

00:07:25.700 --> 00:07:31.179
with the 3 billion parameters. And that's it firing up.

00:07:31.180 --> 00:07:33.899
And it is nice enough to actually be listening on port 8080,

00:07:33.900 --> 00:07:35.339
which we'll need in a minute.

00:07:35.340 --> 00:07:43.139
So once you do that, you have to install gptel in Emacs.

00:07:43.140 --> 00:07:45.659
That's as simple as firing up Emacs,

00:07:45.660 --> 00:07:48.339
doing M-x package-install,

00:07:48.340 --> 00:07:49.779
and then just typing gptel

00:07:49.780 --> 00:07:51.499
if you have your repository set up right,

00:07:51.500 --> 00:07:52.299
which hopefully you do.

00:07:52.300 --> 00:07:54.499
And then you just go ahead and have it.
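
In Elisp terms, the install step just described amounts to something like this, assuming MELPA or another archive carrying gptel is already in your package-archives:

```elisp
;; Refresh archive contents, then install gptel if it's missing.
(require 'package)
(package-refresh-contents)
(unless (package-installed-p 'gptel)
  (package-install 'gptel))
```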

00:07:54.500 --> 00:07:58.139
You also have to set up a config file.

00:07:58.140 --> 00:08:01.739
Here's my example config file as it's currently set up,

00:08:01.740 --> 00:08:04.019
requiring gptel to ensure it's loaded,

00:08:04.020 --> 00:08:05.899
defining the Llamafile backend.

00:08:05.900 --> 00:08:07.779
You can put multiple backends into it,

00:08:07.780 --> 00:08:09.859
but I just have the one defined on this example.

00:08:09.860 --> 00:08:12.059
But it's pretty straightforward.

00:08:12.060 --> 00:08:16.739
Llamafile local, the name for it; stream; protocol HTTP.

00:08:16.740 --> 00:08:20.859
If you have HTTPS set up, that's obviously preferable,

00:08:20.860 --> 00:08:22.779
but a lot of people don't for their home labs.

00:08:22.780 --> 00:08:26.379
Host is just 127.0.0.1 port 8080.

00:08:26.380 --> 00:08:30.099
Keep in mind, some of the AIs run on a different port,

00:08:30.100 --> 00:08:31.499
so you may be 8081

00:08:31.500 --> 00:08:34.619
if you're running OpenWebUI at the same time. The key,

00:08:34.620 --> 00:08:37.019
we don't need an API key because it's a local server.

00:08:37.020 --> 00:08:40.259
And the models just, uh, we can put multiple models

00:08:40.260 --> 00:08:41.339
on there if we want to.

00:08:41.340 --> 00:08:43.699
So if we create one with additional stuff

00:08:43.700 --> 00:08:45.379
or like RAG and stuff like that,

00:08:45.380 --> 00:08:47.459
we can actually name those models by their domain,

00:08:47.460 --> 00:08:48.699
which is really kind of cool.

00:08:48.700 --> 00:08:52.099
But, uh, that's all that takes.
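
Putting the pieces just described together, a config along these lines should work. The backend and model names here are ones I'm choosing for illustration, so adjust them to match your setup:

```elisp
(require 'gptel)

;; Define a local Llamafile backend. The llamafile server speaks the
;; OpenAI-compatible protocol, so gptel-make-openai works here, and no
;; API key is needed because the server runs on this machine.
(setq gptel-backend
      (gptel-make-openai "llamafile-local" ; illustrative name
        :stream t
        :protocol "http"          ; use "https" if your home lab has it set up
        :host "127.0.0.1:8080"    ; try 8081 if something like OpenWebUI holds 8080
        :key nil                  ; local server, no key
        :models '("llama-3.2-3b"))) ; any label works; list more models if defined
```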

00:08:52.100 --> 00:09:03.779
So let's go ahead and go to a quick test of it.

00:09:03.780 --> 00:09:11.019
Oops. Alt-X, gptel. And we're going to just choose

00:09:11.020 --> 00:09:12.499
the default buffer to make things easier.

00:09:12.500 --> 00:09:15.339
Going to resize it up a bit.

00:09:15.340 --> 00:09:19.859
And usually the go-to question I ask is, who was David Bowie?

00:09:19.860 --> 00:09:24.499
This one is actually a question

00:09:24.500 --> 00:09:26.219
that's turned out to be really good

00:09:26.220 --> 00:09:28.019
for figuring out how complete an AI is.

00:09:28.020 --> 00:09:31.139
This is one that some engines do well on, other ones don't.

00:09:31.140 --> 00:09:33.739
And we can just do, we can either do

00:09:33.740 --> 00:09:36.059
the Alt-X and gptel-send,

00:09:36.060 --> 00:09:37.979
or we can just do control C and hit enter.

00:09:37.980 --> 00:09:39.139
We'll just do control C and enter.

00:09:39.140 --> 00:09:43.659
And now it's going ahead and hitting our local AI system

00:09:43.660 --> 00:09:46.659
running on port 8080. And that looks pretty good,

00:09:46.660 --> 00:09:50.739
but let's go ahead and say, hey, it's set to terse mode right now.

00:09:50.740 --> 00:10:03.859
Please expand upon this. And there we go.

00:10:03.860 --> 00:10:05.379
We're getting a full description

00:10:05.380 --> 00:10:08.739
of the majority of, uh, about David Bowie's life

00:10:08.740 --> 00:10:10.139
and other information about him.

00:10:10.140 --> 00:10:21.699
So very, very happy with that.

00:10:21.700 --> 00:10:23.539
One thing to keep in mind is to look at things

00:10:23.540 --> 00:10:24.699
when you're looking for hallucinations,

00:10:24.700 --> 00:10:26.899
how accurate the AI is, how it's compressed:

00:10:26.900 --> 00:10:29.259
it will tend to screw up on things like

00:10:29.260 --> 00:10:30.859
how many children he had and stuff like that.

00:10:30.860 --> 00:10:32.459
Let me see if it gets to that real quick.

00:10:32.460 --> 00:10:39.739
Is it not actually on this one?

00:10:39.740 --> 00:10:42.179
Alright, so that's the first question I always ask.

00:10:42.180 --> 00:10:44.659
The next one is what are sea monkeys?

00:10:44.660 --> 00:10:48.979
It gives you an idea of the breadth of the system.

00:10:48.980 --> 00:11:10.619
It's querying right now. Pulls it back correctly. Yes.

00:11:10.620 --> 00:11:12.339
And it's smart enough to actually detect David Bowie

00:11:12.340 --> 00:11:15.019
even referenced sea monkeys in the song "Sea of Love,"

00:11:15.020 --> 00:11:16.179
which became a hit single.

00:11:16.180 --> 00:11:18.859
So it's actually keeping the context alive

00:11:18.860 --> 00:11:20.419
and that which is very cool feature.

00:11:20.420 --> 00:11:21.459
I did not see that coming.

00:11:21.460 --> 00:11:24.139
Here's one that some people say is a really good one

00:11:24.140 --> 00:11:25.739
to ask: how many Rs are in strawberry?

00:11:25.740 --> 00:11:46.179
All right, now she's going off the reservation.

00:11:46.180 --> 00:11:48.139
She's going in a different direction.

00:11:48.140 --> 00:11:49.979
Let me go ahead and reopen that again,

00:11:49.980 --> 00:11:52.979
because it went down a bad hole there for a second.

00:11:52.980 --> 00:11:58.419
Let me ask it to write hello world in Emacs Lisp.

00:11:58.420 --> 00:12:10.419
Yep, that works. So the point being here,

00:12:10.420 --> 00:12:14.939
that was like two minutes of setup.

00:12:14.940 --> 00:12:18.019
And now we have a small AI embedded inside the system.

00:12:18.020 --> 00:12:20.539
So that gives you an idea just how easy it can be.

00:12:20.540 --> 00:12:22.299
And it's just running locally on the system.

00:12:22.300 --> 00:12:25.259
We also have the default system here as well.

00:12:25.260 --> 00:12:32.579
So not that bad.

00:12:32.580 --> 00:12:35.379
That's a basic solution, that's a basic setup

00:12:35.380 --> 00:12:37.059
that will get you to the point where you can go like,

00:12:37.060 --> 00:12:39.859
it's a party trick, but it's a very cool party trick.

00:12:39.860 --> 00:12:42.859
The way that Gptel works is it puts it into buffers,

00:12:42.860 --> 00:12:45.099
it doesn't interfere with your flow that much,

00:12:45.100 --> 00:12:47.179
it's just an additional window you can pop open

00:12:47.180 --> 00:12:49.019
to ask questions and get information for,

00:12:49.020 --> 00:12:51.459
dump code into it and have it refactored.

00:12:51.460 --> 00:12:53.339
Gptel has a lot of additional options

00:12:53.340 --> 00:12:55.699
for things that are really cool for that.

00:12:55.700 --> 00:12:57.099
But if you want a better solution,

00:12:57.100 --> 00:12:59.939
I recommend Ollama or LM Studio.

00:12:59.940 --> 00:13:01.899
They're both more capable than Llamafile.

00:13:01.900 --> 00:13:03.859
They can accept a lot of different models.

00:13:03.860 --> 00:13:05.739
You can do things like RAG.

00:13:05.740 --> 00:13:09.219
You can do loading of things onto the GPU more explicitly.

00:13:09.220 --> 00:13:10.379
It can speed stuff up.

00:13:10.380 --> 00:13:13.059
One of the things about retrieval-augmented generation is

00:13:13.060 --> 00:13:15.539
it will let you put your data into the system

00:13:15.540 --> 00:13:17.779
so you can start uploading your code, your information,

00:13:17.780 --> 00:13:20.139
and actually being able to do analysis of it.

00:13:20.140 --> 00:13:23.539
OpenWebUI provides more capabilities.

00:13:23.540 --> 00:13:24.859
It provides an interface that's similar

00:13:24.860 --> 00:13:25.899
to what you're used to seeing

00:13:25.900 --> 00:13:28.179
for ChatGPT and the other systems.

00:13:28.180 --> 00:13:29.419
It's really quite well done.

00:13:29.420 --> 00:13:32.539
And once again, gptel, I have to mention that

00:13:32.540 --> 00:13:34.779
because that's the one I really kind of like.

00:13:34.780 --> 00:13:36.899
And Ollama Buddy is also another really nice one.

00:13:36.900 --> 00:13:41.019
So what about the licensing of these models?

00:13:41.020 --> 00:13:42.299
Since I'm going out pulling down

00:13:42.300 --> 00:13:43.579
a model and doing this stuff.

00:13:43.580 --> 00:13:46.579
Let's take a look at a couple of highlights

00:13:46.580 --> 00:13:49.379
from the Meta Llama 3 community license.

00:13:49.380 --> 00:13:52.579
If your service exceeds 700 million monthly users,

00:13:52.580 --> 00:13:54.099
you need additional licensing.

00:13:54.100 --> 00:13:56.099
Probably not going to be a problem for most of us.

00:13:56.100 --> 00:13:58.379
There's a competition restriction.

00:13:58.380 --> 00:14:00.899
You can't use this model to enhance competing models.

00:14:00.900 --> 00:14:04.219
And there's some limitations on using the Meta trademarks.

00:14:04.220 --> 00:14:05.939
Not that big a deal.

00:14:05.940 --> 00:14:09.139
And the other ones are: it's a permissive one

00:14:09.140 --> 00:14:10.939
designed to encourage innovation,

00:14:10.940 --> 00:14:13.779
open development, commercial use is allowed,

00:14:13.780 --> 00:14:15.219
but there are some restrictions on it.

00:14:15.220 --> 00:14:17.259
Yeah, you can modify the model,

00:14:17.260 --> 00:14:20.419
but you have to follow the license terms.

00:14:20.420 --> 00:14:22.339
And you can distribute the model with derivatives.

00:14:22.340 --> 00:14:24.059
And there are some very cool ones out there.

00:14:24.060 --> 00:14:25.259
There's people who've done things

00:14:25.260 --> 00:14:29.579
to try and make the Llama be less, what's the phrase,

00:14:29.580 --> 00:14:31.939
ethical if you're doing penetration testing research

00:14:31.940 --> 00:14:32.619
and stuff like that.

00:14:32.620 --> 00:14:34.459
It has some very nice value there.

00:14:34.460 --> 00:14:37.739
Keep in mind licenses also vary

00:14:37.740 --> 00:14:39.619
depending on the model you're using.

00:14:39.620 --> 00:14:42.419
Mistral AI has the non-production license.

00:14:42.420 --> 00:14:45.219
It's designed to keep it to research and development.

00:14:45.220 --> 00:14:46.739
You can't use it commercially.

00:14:46.740 --> 00:14:50.419
So it's designed to clearly delineate

00:14:50.420 --> 00:14:52.939
between research and development

00:14:52.940 --> 00:14:54.259
and somebody trying to actually build

00:14:54.260 --> 00:14:55.379
something on top of it.

00:14:55.380 --> 00:14:57.979
And another question I get asked is,

00:14:57.980 --> 00:14:59.899
are there open source data model options?

00:14:59.900 --> 00:15:02.819
Yeah, but most of them are small or specialized currently.

00:15:02.820 --> 00:15:05.499
MoMo is a whole family of them,

00:15:05.500 --> 00:15:07.339
but they tend to be more specialized,

00:15:07.340 --> 00:15:09.019
but it's very cool to see where it's going.

00:15:09.020 --> 00:15:11.339
And it's another thing that's just going forward.

00:15:11.340 --> 00:15:13.379
It's under the MIT license.

00:15:13.380 --> 00:15:15.819
Some things to know to help you

00:15:15.820 --> 00:15:17.499
have a better experience with this.

00:15:17.500 --> 00:15:21.059
Get Ollama and OpenWebUI working by themselves,

00:15:21.060 --> 00:15:22.659
then set up your config file.

00:15:22.660 --> 00:15:24.819
I was fighting both at the same time,

00:15:24.820 --> 00:15:26.699
and it turned out I had a problem with my Ollama.

00:15:26.700 --> 00:15:28.899
I had a conflict, so that was what my problem was.

00:15:28.900 --> 00:15:32.819
Llamafile plus gptel is a great way to start experimenting

00:15:32.820 --> 00:15:34.299
just to get you an idea of how it works

00:15:34.300 --> 00:15:36.939
and figure out how the interfaces work. Tremendous.

00:15:36.940 --> 00:15:40.739
RAG, loading documents into it, is really easy with OpenWebUI.

00:15:40.740 --> 00:15:43.019
You can create models, you can put things like

00:15:43.020 --> 00:15:46.419
help desk, developers, and stuff like that, breaking it out.

00:15:46.420 --> 00:15:51.019
Hacker News has a "How to build a $300 AI computer" piece.

00:15:51.020 --> 00:15:52.859
It's from March 2024,

00:15:52.860 --> 00:15:55.099
but it still has a lot of great information

00:15:55.100 --> 00:15:56.819
on how to benchmark the environments,

00:15:56.820 --> 00:16:01.339
what some values are like the Ryzen 5700U

00:16:01.340 --> 00:16:02.579
inside my Acer Aspire,

00:16:02.580 --> 00:16:04.419
that's where I got the idea of doing that.

00:16:04.420 --> 00:16:06.739
Make sure you do the ROCm stuff correctly

00:16:06.740 --> 00:16:09.899
to get the GPU extensions. But it's just really good stuff.

00:16:09.900 --> 00:16:13.059
You don't need a great GPU or CPU to get started.

00:16:13.060 --> 00:16:14.819
Smaller models like Tiny Llama

00:16:14.820 --> 00:16:16.179
can run on very small systems.

00:16:16.180 --> 00:16:18.499
It gets you the ability to start playing with it

00:16:18.500 --> 00:16:21.619
and start experimenting and figure out if that's for you

00:16:21.620 --> 00:16:23.379
and to move forward with it.

00:16:23.380 --> 00:16:29.219
The AMD Ryzen AI Max+ 395 in a mini PC

00:16:29.220 --> 00:16:31.179
makes a really nice dedicated host.

00:16:31.180 --> 00:16:34.619
You used to be able to buy these for about $1200. Now,

00:16:34.620 --> 00:16:35.579
with the RAM price increase,

00:16:35.580 --> 00:16:38.779
you want to get 128 gig, and you're pushing two grand, so...

00:16:38.780 --> 00:16:40.739
It gets a little tighter.

00:16:40.740 --> 00:16:44.099
Macs work remarkably well with AI.

00:16:44.100 --> 00:16:47.659
My MacBook Air was one of my go-tos for a while,

00:16:47.660 --> 00:16:49.779
but once I started doing anything AI,

00:16:49.780 --> 00:16:50.779
I had a five-minute window

00:16:50.780 --> 00:16:52.619
before the thermal throttling became an issue.

00:16:52.620 --> 00:16:54.619
Keep in mind that's a MacBook Air,

00:16:54.620 --> 00:16:56.659
so it doesn't have the greatest ventilation.

00:16:56.660 --> 00:16:58.339
If you get the MacBook Pros and stuff,

00:16:58.340 --> 00:17:00.139
they tend to have more ventilation,

00:17:00.140 --> 00:17:02.499
but still you're going to be pushing against that.

00:17:02.500 --> 00:17:04.939
So Mac Minis and the Mac Ultras and stuff like that

00:17:04.940 --> 00:17:06.099
tend to work really well for that.

00:17:06.100 --> 00:17:09.779
Alex Ziskind on YouTube has a channel.

00:17:09.780 --> 00:17:11.899
He does a lot of AI performance benchmarking,

00:17:11.900 --> 00:17:14.819
like I load a 70 billion parameter model

00:17:14.820 --> 00:17:16.699
on this mini PC and stuff like that.

00:17:16.700 --> 00:17:19.019
It's a lot of fun and interesting stuff there.

00:17:19.020 --> 00:17:21.219
And it's influencing my decision

00:17:21.220 --> 00:17:22.979
to buy my next AI-style PC.

00:17:22.980 --> 00:17:27.619
Small domain specific LLMs are happening.

00:17:27.620 --> 00:17:29.939
An LLM that has all your code and information,

00:17:29.940 --> 00:17:31.659
it sounds like a really cool idea.

00:17:31.660 --> 00:17:34.299
It gives you capabilities to start training stuff

00:17:34.300 --> 00:17:35.899
that you couldn't do with like the big ones.

00:17:35.900 --> 00:17:38.059
Even in terms of fine-tuning and stuff,

00:17:38.060 --> 00:17:40.539
it's remarkable to see where that space is coming along

00:17:40.540 --> 00:17:41.739
in the next year or so.

00:17:41.740 --> 00:17:46.219
huggingface.co has pointers to tons of AI models.

00:17:46.220 --> 00:17:49.259
You'll find the one that works for you, hopefully there.

00:17:49.260 --> 00:17:50.539
If you're doing cybersecurity,

00:17:50.540 --> 00:17:52.059
there's a whole bunch out there for that,

00:17:52.060 --> 00:17:54.619
that have had certain training and information put into them.

00:17:54.620 --> 00:17:56.139
It's really good.

00:17:56.140 --> 00:18:00.099
One last thing to keep in mind is hallucinations are real.

00:18:00.100 --> 00:18:02.779
You will get BS back from the AI occasionally,

00:18:02.780 --> 00:18:05.179
so do validate everything you get from it.

00:18:05.180 --> 00:18:08.459
Don't be using it for court cases like some people have

00:18:08.460 --> 00:18:14.539
and run into those problems. So, that is my talk.

00:18:14.540 --> 00:18:17.219
What I would like you to get out of that is,

00:18:17.220 --> 00:18:21.859
if you haven't tried it, give gptel and Llamafile a shot.

00:18:21.860 --> 00:18:23.979
Fire up a little small AI instance,

00:18:23.980 --> 00:18:27.339
play around with it a little bit inside your Emacs,

00:18:27.340 --> 00:18:30.139
and see if it makes your life better. Hopefully it will.

00:18:30.140 --> 00:18:32.139
And I really hope you guys

00:18:32.140 --> 00:18:34.659
learned something from this talk. And thanks for listening.

00:18:34.660 --> 00:18:38.979
And the links are at the end of the talk, if you have any questions.

00:18:38.980 --> 00:18:42.739
Let me see if we got anything you want, Pat. You do.

00:18:42.740 --> 00:18:43.899
You've got a few questions.

00:18:43.900 --> 00:18:48.059
Hey, this is Corwin. Thank you so much. Thank you, Aaron.

00:18:48.060 --> 00:18:50.339
What an awesome talk this was, actually.

00:18:50.340 --> 00:18:52.179
If you don't have a camera,

00:18:52.180 --> 00:18:54.339
I can get away with not having one too.

00:18:54.340 --> 00:18:56.299
I've got, I'll turn the camera on.

00:18:56.300 --> 00:19:01.499
Okay. All right. I'll turn mine back on. Here I come.

00:19:01.500 --> 00:19:03.139
Yeah, so there are a few questions,

00:19:03.140 --> 00:19:04.579
but first let me say thank you

00:19:04.580 --> 00:19:06.339
for a really captivating talk.

00:19:06.340 --> 00:19:10.939
I think a lot of people will be empowered from this

00:19:10.940 --> 00:19:15.259
to try to do more with less, especially locally.

00:19:15.260 --> 00:19:20.179
People are concerned about the data center footprint,

00:19:20.180 --> 00:19:23.659
environmentally concerned

00:19:23.660 --> 00:19:26.979
about the footprint of LLMs inside data centers.

00:19:26.980 --> 00:19:28.219
So just thinking about how we can

00:19:28.220 --> 00:19:32.419
put infrastructure we have at home to use

00:19:32.420 --> 00:19:34.019
and get more done with less.

00:19:34.020 --> 00:19:37.499
Yeah, the data center impact's interesting

00:19:37.500 --> 00:19:39.979
because there was a study a while ago.

00:19:39.980 --> 00:19:42.099
Someone said every time you do a Gemini query,

00:19:42.100 --> 00:19:45.019
it's like boiling a cup of water.

00:19:45.020 --> 00:19:48.619
Yeah, I've heard that one too. So do you want to, you know,

00:19:48.620 --> 00:19:51.699
I don't know how much direction you want.

00:19:51.700 --> 00:19:53.859
I'd be very happy to read out the questions for you.

00:19:53.860 --> 00:19:55.219
Yeah, that would be great.

00:19:55.220 --> 00:19:57.619
I'm having trouble getting to that tab.

00:19:57.620 --> 00:20:02.779
Okay, I'm there, so I'll put it into our chat too,

00:20:02.780 --> 00:20:07.419
so you can follow along if you'd like.

00:20:07.420 --> 00:20:11.219
The first question was, why is the David Bowie question

00:20:11.220 --> 00:20:12.219
a good one to start with?

00:20:12.220 --> 00:20:14.419
Does it have interesting failure conditions

00:20:14.420 --> 00:20:17.299
or what made you choose that?

00:20:17.300 --> 00:20:21.979
First off, huge fan of David Bowie.

00:20:21.980 --> 00:20:24.499
But it came down to, it really taught me a few things

00:20:24.500 --> 00:20:26.299
about how the models work

00:20:26.300 --> 00:20:28.819
in terms of things like how many kids he had,

00:20:28.820 --> 00:20:31.779
because DeepSeek, which is a very popular Chinese model

00:20:31.780 --> 00:20:33.179
that a lot of people are using now,

00:20:33.180 --> 00:20:35.619
misidentifies him as having three daughters,

00:20:35.620 --> 00:20:38.459
and he has, I think,

00:20:38.460 --> 00:20:40.899
two sons and a daughter or something like that.

00:20:40.900 --> 00:20:43.659
So there's differences on that, and it just goes over

00:20:43.660 --> 00:20:45.299
there's a whole lot of stuff

00:20:45.300 --> 00:20:47.779
because his story spans like 60 years

00:20:47.780 --> 00:20:49.659
so it gives good feedback.

00:20:49.660 --> 00:20:51.539
That's the real main reason I asked that question,

00:20:51.540 --> 00:20:53.699
because I just needed one. Sea monkeys I just picked

00:20:53.700 --> 00:20:56.579
because it was obscure and I just always have. Right,

00:20:56.580 --> 00:20:58.939
I used to have it write hello world in Forth

00:20:58.940 --> 00:21:01.019
because I thought it was an interesting one as well. So

00:21:01.020 --> 00:21:03.899
It's just picking random ones like that.

00:21:03.900 --> 00:21:06.499
One question I ask, sorry, a lot of models is,

00:21:06.500 --> 00:21:09.419
what is the closest star to the Earth?

00:21:09.420 --> 00:21:12.019
Because most of them will say Alpha Centauri

00:21:12.020 --> 00:21:13.739
or Proxima Centauri and not the sun.

00:21:13.740 --> 00:21:15.899
And I have a whole nother talk

00:21:15.900 --> 00:21:17.899
where I just argue with the LLM

00:21:17.900 --> 00:21:20.019
trying to say, hey, the sun is a star.

00:21:20.020 --> 00:21:26.579
And it just wouldn't accept it, so. What?

00:21:26.580 --> 00:21:28.419
Oh, I can hear that.

00:21:28.420 --> 00:21:34.379
So what specific tasks do you like to use your local AI for?

00:21:34.380 --> 00:21:37.459
I like to load a lot of my code into it

00:21:37.460 --> 00:21:39.739
and actually have it do analysis of it.

00:21:39.740 --> 00:21:42.339
I was actually going through some code

00:21:42.340 --> 00:21:45.619
I have for some pen testing, and I was having it modify it

00:21:45.620 --> 00:21:47.259
to update it for the newer version,

00:21:47.260 --> 00:21:48.459
because I hate to say this,

00:21:48.460 --> 00:21:49.859
but it was written for Python 2,

00:21:49.860 --> 00:21:51.459
and I needed to update it for Python 3.

00:21:51.460 --> 00:21:53.859
And the 2to3 tool did not do all of it,

00:21:53.860 --> 00:21:56.659
but the AI was able to do the refactoring.

00:21:56.660 --> 00:21:58.499
It's part of my laziness.

00:21:58.500 --> 00:22:01.459
But I use that for anything I don't want to hit the web.

00:22:01.460 --> 00:22:03.259
And that's a lot of stuff when you start thinking about

00:22:03.260 --> 00:22:04.979
if you're doing cyber security researching.

00:22:04.980 --> 00:22:06.819
and you have your white papers

00:22:06.820 --> 00:22:10.779
and stuff like that in there.

00:22:10.780 --> 00:22:13.979
I've got a lot of that loaded into RAG

00:22:13.980 --> 00:22:15.659
in one model on my OpenWebUI system.

00:22:15.660 --> 00:22:21.059
Neat. Have you used

00:22:21.060 --> 00:22:25.739
any small domain-specific LLMs?

00:22:25.740 --> 00:22:30.419
If so, what kind of tasks do they specialize in?

00:22:30.420 --> 00:22:32.139
And you know, how?

00:22:32.140 --> 00:22:34.979
Not yet, to be honest, but there are some out there, like, once again,

00:22:34.980 --> 00:22:36.779
for cybersecurity and stuff like that,

00:22:36.780 --> 00:22:39.739
that I really need to dig into that's on my to do list.

00:22:39.740 --> 00:22:41.699
I've got a couple weeks off at the end of the year.

00:22:41.700 --> 00:22:43.779
And that's a big part of my plan for that.

00:22:43.780 --> 00:22:49.379
Are the various models updated pretty regularly?

00:22:49.380 --> 00:22:52.059
Can you add your own data to the pre-built models?

00:22:52.060 --> 00:22:56.699
Yes. The models are updated pretty reasonably.

00:22:56.700 --> 00:22:59.699
You can add data to a model in a couple of different ways.

00:22:59.700 --> 00:23:01.099
You can do something called fine-tuning,

00:23:01.100 --> 00:23:03.819
which requires a really nice GPU and a lot of CPU time.

00:23:03.820 --> 00:23:05.499
Probably not going to do that.

00:23:05.500 --> 00:23:07.419
You can do retrieval-augmented generation,

00:23:07.420 --> 00:23:09.499
which is where you load your data on top of the system

00:23:09.500 --> 00:23:11.299
and it puts it inside a database

00:23:11.300 --> 00:23:12.859
and you can actually scan that and stuff.

00:23:12.860 --> 00:23:14.619
I have another talk where I go through

00:23:14.620 --> 00:23:16.219
and I start asking questions about,

00:23:16.220 --> 00:23:18.579
I load the talk into the engine

00:23:18.580 --> 00:23:20.099
and I ask questions against that.

00:23:20.100 --> 00:23:22.179
If I had more time, I would have done that,

00:23:22.180 --> 00:23:26.499
but it comes down to how many... That's, that's RAG. RAG

00:23:26.500 --> 00:23:29.419
is pretty easy to do through OpenWebUI or LM Studio.

00:23:29.420 --> 00:23:31.419
It's a great way. You just, like,

00:23:31.420 --> 00:23:34.099
point it to a folder, and it just sucks all that data in,

00:23:34.100 --> 00:23:35.499
and it'll hit that data first

00:23:35.500 --> 00:23:36.859
if you have, like, helpdesk stuff and such.

00:23:36.860 --> 00:23:39.619
The other option is vector databases,

00:23:39.620 --> 00:23:41.819
which is like if you use PostgreSQL.

00:23:41.820 --> 00:23:43.699
It has pgvector, which can do a lot of that stuff.

00:23:43.700 --> 00:23:44.739
I've not dug into that yet,

00:23:44.740 --> 00:23:46.099
but that is also on that to-do list

00:23:46.100 --> 00:23:48.459
I've got a lot of stuff planned for. Cool.

00:23:48.460 --> 00:23:51.819
So what is your experience with RAGs?

00:23:51.820 --> 00:23:54.339
I don't even know what that means.

00:23:54.340 --> 00:23:57.419
Do you know what that means?

00:23:57.420 --> 00:23:59.619
Do you remember this question again?

00:23:59.620 --> 00:24:03.979
What is your experience with RAGs? RAG is great.

00:24:03.980 --> 00:24:07.459
That's retrieval-augmented generation.

00:24:07.460 --> 00:24:09.739
That loads your data first, and it hits yours,

00:24:09.740 --> 00:24:11.499
and it'll actually cite it and stuff.

00:24:11.500 --> 00:24:14.659
There's a guy who wrote a RAG in 100 lines of Python,

00:24:14.660 --> 00:24:16.899
and it's an impressive piece of software.

00:24:16.900 --> 00:24:18.779
I think if you hit one of my sites,

00:24:18.780 --> 00:24:22.099
I've got a private AI talk where I actually refer to that.

00:24:22.100 --> 00:24:25.219
But retrieval augmentation, it's easy, it's fast,

00:24:25.220 --> 00:24:26.699
it puts your data into the system.

00:24:26.700 --> 00:24:31.339
Yeah, start with that and then iterate on top of that.

00:24:31.340 --> 00:24:32.659
That's one of the great things about AI,

00:24:32.660 --> 00:24:33.619
especially private AI,

00:24:33.620 --> 00:24:37.739
is you can do whatever you want to with it

00:24:37.740 --> 00:24:43.179
and build up with it as you get more experience.

00:24:43.180 --> 00:24:44.219
Any thoughts on running things

00:24:44.220 --> 00:24:49.179
on AWS, DigitalOcean, and so on?

00:24:49.180 --> 00:24:50.619
AWS is not bad.

00:24:50.620 --> 00:24:52.659
The DigitalOcean, they have some of their GPUs.

00:24:52.660 --> 00:24:54.379
I still don't like having the data

00:24:54.380 --> 00:24:57.419
leave my house or work, to be honest,

00:24:57.420 --> 00:24:59.019
because I tend to do some stuff

00:24:59.020 --> 00:25:01.259
that I don't want it even hitting that situation.

00:25:01.260 --> 00:25:03.699
But they have pretty good stuff.

00:25:03.700 --> 00:25:05.579
Another one to consider is Oracle Cloud.

00:25:05.580 --> 00:25:09.059
Oracle has their AI infrastructure that's really well done.

00:25:09.060 --> 00:25:12.379
But I mean, once again, then you start looking at the potential.

00:25:12.380 --> 00:25:13.779
They say your data is private,

00:25:13.780 --> 00:25:14.819
I don't necessarily trust it.

00:25:14.820 --> 00:25:17.859
But they do have good stuff, both DigitalOcean, AWS,

00:25:17.860 --> 00:25:20.339
Oracle Cloud has the free service, which isn't too bad,

00:25:20.340 --> 00:25:21.339
usually up to a certain amount of stuff.

00:25:21.340 --> 00:25:23.179
And Google also has it,

00:25:23.180 --> 00:25:26.739
but I still tend to keep more stuff on local PCs,

00:25:26.740 --> 00:25:33.299
because I'm just paranoid that way. Gotcha.

00:25:33.300 --> 00:25:35.579
What has your experience been using AI?

00:25:35.580 --> 00:25:40.139
Do you want to get into that, using AI for cybersecurity?

00:25:40.140 --> 00:25:42.019
You might have already touched on this.

00:25:42.020 --> 00:25:44.379
Yeah, really, for cybersecurity,

00:25:44.380 --> 00:25:46.259
what I've had to do is I've dumped logs

00:25:46.260 --> 00:25:47.299
to have it do correlation.

00:25:47.300 --> 00:25:49.859
Keep in mind, the size of that Llamafile we were using

00:25:49.860 --> 00:25:52.059
for figuring out David Bowie, writing the hello world,

00:25:52.060 --> 00:25:54.179
all that stuff, is like six gig.

00:25:54.180 --> 00:25:56.859
How does it get the entire world in six gig?

00:25:56.860 --> 00:25:59.739
I still haven't figured that out in terms of quantization.

00:25:59.740 --> 00:26:02.499
So I'm really interested in seeing the ability

00:26:02.500 --> 00:26:05.139
to take all this stuff out of all my logs,

00:26:05.140 --> 00:26:06.339
dump it all in there,

00:26:06.340 --> 00:26:08.459
and actually be able to do intelligent queries against that.

00:26:08.460 --> 00:26:10.899
Microsoft has a project called Security Copilot,

00:26:10.900 --> 00:26:12.819
which is trying to do that in the Cloud.

00:26:12.820 --> 00:26:15.299
But I want to work on something to do that more locally

00:26:15.300 --> 00:26:19.019
and be able to actually drive this stuff over that.

00:26:19.020 --> 00:26:21.979
That's one also on the long-term goals.

00:26:21.980 --> 00:26:26.059
So we got any other questions or?

00:26:26.060 --> 00:26:29.099
Those are the questions that I see.

00:26:29.100 --> 00:26:31.179
I want to just read out a couple of comments

00:26:31.180 --> 00:26:33.419
that I saw in IRC though.

00:26:33.420 --> 00:26:36.699
Jay Rutabaga says, it went very well

00:26:36.700 --> 00:26:39.259
from an audience perspective.

00:26:39.260 --> 00:26:43.619
And G Gundam says, respect your commitment to privacy.

00:26:43.620 --> 00:26:45.619
And then somebody is telling us

00:26:45.620 --> 00:26:46.779
we might have skipped a question.

00:26:46.780 --> 00:26:50.019
So I'm just going to run back to my list.

00:26:50.020 --> 00:26:52.819
Updated regularly experience.

00:26:52.820 --> 00:26:57.659
I just didn't type in the answer here,

00:26:57.660 --> 00:26:59.659
and there's a couple more questions coming in so

00:26:59.660 --> 00:27:04.699
Is there a disparity where you go to paid models

00:27:04.700 --> 00:27:08.619
because they are better and what problems?

00:27:08.620 --> 00:27:14.019
You know what would drive you to? That's a good question.

00:27:14.020 --> 00:27:17.819
Paid models, I don't mind them. I think they're good,

00:27:17.820 --> 00:27:21.299
but I don't think they're actually economically sustainable

00:27:21.300 --> 00:27:22.659
under their current system.

00:27:22.660 --> 00:27:24.299
Because right now, if you're paying

00:27:24.300 --> 00:27:26.899
20 bucks a month for Copilot and that goes up to 200 bucks,

00:27:26.900 --> 00:27:28.499
I'm not going to be as likely to use it.

00:27:28.500 --> 00:27:29.579
You know what I mean?

00:27:29.580 --> 00:27:33.059
But it does do some things in a way that I did not expect.

00:27:33.060 --> 00:27:35.459
For example, Grok was refactoring

00:27:35.460 --> 00:27:38.019
some of my code and, in the comments, dropped an F-bomb,

00:27:38.020 --> 00:27:39.979
which I did not see coming,

00:27:39.980 --> 00:27:41.619
but the other code before

00:27:41.620 --> 00:27:43.219
that I had gotten off GitHub

00:27:43.220 --> 00:27:44.059
had F-bombs in it.

00:27:44.060 --> 00:27:45.899
So it was just emulating the style,

00:27:45.900 --> 00:27:47.779
but would that be something

00:27:47.780 --> 00:27:49.979
I'd want to turn in a pull request? I don't know.

00:27:49.980 --> 00:27:52.139
But, uh, there's, there's a lot of money

00:27:52.140 --> 00:27:53.899
going into these AIs and stuff,

00:27:53.900 --> 00:27:56.219
but in terms of the ability to get a decent one,

00:27:56.220 --> 00:27:57.979
like the Llama, Llama 3.2,

00:27:57.980 --> 00:28:01.699
and load your data into it, you can be pretty competitive.

00:28:01.700 --> 00:28:04.779
You're not going to get all the benefits,

00:28:04.780 --> 00:28:07.299
but you have more control over it.

00:28:07.300 --> 00:28:11.819
So it's a bit of this and that; it's a,

00:28:11.820 --> 00:28:13.139
it's a balancing act.

00:28:13.140 --> 00:28:15.539
Okay, and I think I see a couple more questions coming in.

00:28:15.540 --> 00:28:19.619
What is the largest parameter size for local models

00:28:19.620 --> 00:28:22.459
that you've been able to successfully run locally

00:28:22.460 --> 00:28:26.059
and do run into issues with limited context window size?

00:28:26.060 --> 00:28:29.659
The top-end models will tend to have a larger ceiling.

00:28:29.660 --> 00:28:32.859
Yes, yes, yes, yes, yes.

00:28:32.860 --> 00:28:37.019
By default, the context size is I think 1024.

00:28:37.020 --> 00:28:44.619
But I've upped it to 8192 on this box, the Pangolin,

00:28:44.620 --> 00:28:46.939
because, for some reason,

00:28:46.940 --> 00:28:49.459
it's just working quite well.

00:28:49.460 --> 00:28:52.219
But the largest ones I've loaded have

00:28:52.220 --> 00:28:54.059
not been that huge.

00:28:54.060 --> 00:28:55.699
That's about the biggest one I've done.

00:28:55.700 --> 00:28:57.459
That's the reason why I'm planning

00:28:57.460 --> 00:29:01.339
on breaking down and buying a Ryzen.

00:29:01.340 --> 00:29:03.619
Actually, I'm going to buy

00:29:03.620 --> 00:29:06.979
an Intel Core Ultra 285H with 96 gig of RAM.

00:29:06.980 --> 00:29:08.379
Then I should be able to load

00:29:08.380 --> 00:29:12.059
a 70 billion parameter model in that. How fast will it run?

00:29:12.060 --> 00:29:13.819
It's going to run slow as a dog,

00:29:13.820 --> 00:29:15.819
but it's going to be cool to be able to do it.

00:29:15.820 --> 00:29:17.379
It's an AI bragging rights thing,

00:29:17.380 --> 00:29:20.019
but I mostly stick with the smaller size models

00:29:20.020 --> 00:29:22.819
and the ones that are more quantized

00:29:22.820 --> 00:29:26.619
because it just tends to work better for me.

00:29:26.620 --> 00:29:29.179
We've still got over 10 minutes before we're cutting away,

00:29:29.180 --> 00:29:30.179
but I'm just anticipating

00:29:30.180 --> 00:29:32.859
that we're going to be going strong at the 10 minute mark.

00:29:32.860 --> 00:29:34.899
So I'm just letting you know,

00:29:34.900 --> 00:29:37.379
we can go as long as we like here at a certain point.

00:29:37.380 --> 00:29:41.059
I may have to jump away and check in with the next speaker,

00:29:41.060 --> 00:29:44.419
but we'll post the entirety of this,

00:29:44.420 --> 00:29:47.979
even if we aren't able to stay with it all.

00:29:47.980 --> 00:29:49.739
Okay. And we've got 10 minutes

00:29:49.740 --> 00:29:52.379
where we're still going to stay live.

00:29:52.380 --> 00:30:00.139
So next question coming in, I see, are there free as in freedom,

00:30:00.140 --> 00:30:05.739
free as in FSF issues with the data?

00:30:05.740 --> 00:30:11.699
Yes, where's the data coming from is a huge question with AI.

00:30:11.700 --> 00:30:13.739
It's astonishing you can ask questions

00:30:13.740 --> 00:30:16.899
to models that you don't know where it's coming from.

00:30:16.900 --> 00:30:19.979
That is gonna be one of the big issues long-term.

00:30:19.980 --> 00:30:21.499
There are people who are working

00:30:21.500 --> 00:30:22.979
on trying to figure out that stuff,

00:30:22.980 --> 00:30:25.259
but it's, I mean, if you look at, God,

00:30:25.260 --> 00:30:27.059
I can't remember who it was.

00:30:27.060 --> 00:30:28.659
Somebody was actually out torrenting books

00:30:28.660 --> 00:30:30.939
just to be able to build them into their AI system.

00:30:30.940 --> 00:30:32.339
I think it might've been Meta.

00:30:32.340 --> 00:30:34.819
So there's a lot of that going on.

00:30:34.820 --> 00:30:38.139
The open-sourcing of this stuff is going to be tough.

00:30:38.140 --> 00:30:39.459
There's some models,

00:30:39.460 --> 00:30:41.419
like the mobile guys have got their own license,

00:30:41.420 --> 00:30:42.739
but where they're getting their data from,

00:30:42.740 --> 00:30:45.499
I'm not sure on so that that's a huge question.

00:30:45.500 --> 00:30:47.979
That's a that's a talk in itself.

00:30:47.980 --> 00:30:51.979
But yeah, if you train on your RAG and your data,

00:30:51.980 --> 00:30:53.499
you know where it's coming from, you know,

00:30:53.500 --> 00:30:54.379
you have a license to that,

00:30:54.380 --> 00:30:55.139
but the other stuff is just

00:30:55.140 --> 00:30:56.739
more lines of supplement

00:30:56.740 --> 00:31:01.379
if you're using a smaller model,

00:31:01.380 --> 00:31:05.419
but the comment online, I see a couple of them.

00:31:05.420 --> 00:31:08.339
I'll read them out in order here. Really interesting stuff.

00:31:08.340 --> 00:31:11.659
Thank you for your talk. Given that large AI companies

00:31:11.660 --> 00:31:14.899
are openly stealing intellectual property and copyright

00:31:14.900 --> 00:31:18.939
and therefore eroding the authority of such laws

00:31:18.940 --> 00:31:21.579
and maybe obscuring the truth itself,

00:31:21.580 --> 00:31:26.579
can you see a future where IP and copyright law become untenable?

00:31:26.580 --> 00:31:29.619
I think that's a great question.

00:31:29.620 --> 00:31:34.979
I'm not a lawyer, but it is really getting complicated.

00:31:34.980 --> 00:31:37.859
It is getting to the point, I asked a question from,

00:31:37.860 --> 00:31:41.179
I played with Sora a little bit, and it generated someone,

00:31:41.180 --> 00:31:42.819
you can go like, oh, that's Jon Hamm,

00:31:42.820 --> 00:31:44.099
that's Christopher Walken,

00:31:44.100 --> 00:31:45.379
you start figuring out who the people

00:31:45.380 --> 00:31:47.019
they're modeling stuff after.

00:31:47.020 --> 00:31:48.979
There is an apocalypse, something's

00:31:48.980 --> 00:31:52.459
going to happen right now.

00:31:52.460 --> 00:31:53.579
There is, but this is once again,

00:31:53.580 --> 00:31:56.059
my personal opinion, and I'm not a lawyer,

00:31:56.060 --> 00:31:57.459
and I do not have money.

00:31:57.460 --> 00:31:58.859
So don't sue me, is there's going to be

00:31:58.860 --> 00:32:02.899
the current administration is very pro-AI.

00:32:02.900 --> 00:32:05.499
And there's a great deal of lobbying by those groups.

00:32:05.500 --> 00:32:07.139
And it's on both sides.

00:32:07.140 --> 00:32:09.699
And it's going to be, it's gonna be interesting to see

00:32:09.700 --> 00:32:11.699
what happens to copyright the next 5-10 years.

00:32:11.700 --> 00:32:13.339
I just don't know how it keeps up

00:32:13.340 --> 00:32:16.059
without there being some adjustments and stuff.

00:32:16.060 --> 00:32:20.419
Okay, and then another comment I saw,

00:32:20.420 --> 00:32:23.219
file size is not going to be a bottleneck.

00:32:23.220 --> 00:32:25.819
RAM is. You'll need 16 gigabytes of RAM

00:32:25.820 --> 00:32:28.259
to run the smallest local models

00:32:28.260 --> 00:32:31.979
and 512 gigabytes of RAM to run the larger ones.

00:32:31.980 --> 00:32:35.059
You'll need a GPU with that much memory

00:32:35.060 --> 00:32:39.099
if you want it to run quickly. Yeah. Oh no.

00:32:39.100 --> 00:32:41.259
It also depends upon how your memory is laid out.

00:32:41.260 --> 00:32:45.699
Like example being the Ultra i285H

00:32:45.700 --> 00:32:47.899
I plan to buy, that has 96 gig of memory.

00:32:47.900 --> 00:32:50.499
It's unified memory; the GPU and the CPU share it,

00:32:50.500 --> 00:32:52.739
but they go over the same bus.

00:32:52.740 --> 00:32:55.779
So the overall bandwidth of it tends to be a bit less,

00:32:55.780 --> 00:32:57.579
but you're able to load more of it into memory.

00:32:57.580 --> 00:32:59.419
So it's able to do some additional stuff with it

00:32:59.420 --> 00:33:00.819
as opposed to coming off disk.

00:33:00.820 --> 00:33:03.699
It's all a balancing act. If you hit Zyskin's website,

00:33:03.700 --> 00:33:05.819
that guy's done some great work on it.

00:33:05.820 --> 00:33:07.499
trying to figure out how big a model you can run,

00:33:07.500 --> 00:33:08.619
what you can do with it.

00:33:08.620 --> 00:33:12.699
And some of the stuff seems to be not obvious,

00:33:12.700 --> 00:33:15.299
because, for example, that MacBook Air,

00:33:15.300 --> 00:33:17.619
for the five minutes I can run the model,

00:33:17.620 --> 00:33:19.379
it runs it faster than a lot of other things

00:33:19.380 --> 00:33:21.339
that should be able to run it faster,

00:33:21.340 --> 00:33:24.619
just because of the way the ARM cores and the unified memory work on it.

00:33:24.620 --> 00:33:26.019
So it's a learning process.

00:33:26.020 --> 00:33:29.579
But if you want to, Network Chuck had a great video

00:33:29.580 --> 00:33:30.939
talking about building his own system

00:33:30.940 --> 00:33:34.379
with a couple really powerful NVIDIA cards

00:33:34.380 --> 00:33:35.379
and stuff like that in it.

00:33:35.380 --> 00:33:38.859
And just actually setting it up on his system as a node

00:33:38.860 --> 00:33:41.459
and using a web UI on it. So there's a lot of stuff there,

00:33:41.460 --> 00:33:43.899
but it is a process of learning how big your data is,

00:33:43.900 --> 00:33:44.899
which models you want to use,

00:33:44.900 --> 00:33:46.219
how much information you need,

00:33:46.220 --> 00:33:48.019
but it's part of the learning.

00:33:48.020 --> 00:33:52.899
And you can run models, even on Raspberry Pi 5s,

00:33:52.900 --> 00:33:54.499
if you want to. They'll run slow.

00:33:54.500 --> 00:33:56.459
Don't get me wrong, but they're possible.

00:33:56.460 --> 00:34:02.179
Okay, and I think there's other questions coming in too,

00:34:02.180 --> 00:34:04.019
so I'll just vamp for another second.

00:34:04.020 --> 00:34:06.299
We've got about five minutes before we'll,

00:34:06.300 --> 00:34:09.739
before we'll be cutting over,

00:34:09.740 --> 00:34:13.179
but I just want to say in case we get close for time here,

00:34:13.180 --> 00:34:14.859
how much I appreciate your talk.

00:34:14.860 --> 00:34:15.979
This is another one that I'm going to

00:34:15.980 --> 00:34:18.339
have to study after the conference.

00:34:18.340 --> 00:34:21.099
We greatly appreciate, all of us appreciate

00:34:21.100 --> 00:34:22.459
you guys putting on the conference.

00:34:22.460 --> 00:34:26.299
It's a great conference. It's well done.

00:34:26.300 --> 00:34:28.019
It's an honor to be on the stage

00:34:28.020 --> 00:34:30.899
with the brains of the project, which is you.

00:34:30.900 --> 00:34:34.699
So what else we got, question-wise?

00:34:34.700 --> 00:34:39.499
Okay, so just scanning here.

00:34:39.500 --> 00:34:50.699
Have you used local models capable of tool calling?

00:34:50.700 --> 00:34:54.779
I'm scared of agentic.

00:34:54.780 --> 00:34:58.739
I'm going to be a slow adopter of that.

00:34:58.740 --> 00:35:02.459
I want to do it, but I just don't have the, uh,

00:35:02.460 --> 00:35:04.339
intestinal fortitude right now to do it.

00:35:04.340 --> 00:35:07.179
I've had it give me the commands,

00:35:07.180 --> 00:35:08.739
but I still run the commands by hand.

00:35:08.740 --> 00:35:10.539
I'm looking into it, and once again,

00:35:10.540 --> 00:35:14.139
it's on that list, but I just, that's a big step for me.

00:35:14.140 --> 00:35:23.139
So. Awesome. All right.

00:35:23.140 --> 00:35:27.179
Well, let me just scroll through

00:35:27.180 --> 00:35:31.539
because we might have missed one question. Oh, I see.

00:35:31.540 --> 00:35:36.899
Here was the piggyback question.

00:35:36.900 --> 00:35:38.419
Now I see the question that I missed.

00:35:38.420 --> 00:35:41.139
So this was piggybacking on the question

00:35:41.140 --> 00:35:44.859
about model updates and adding data.

00:35:44.860 --> 00:35:46.579
And will models reach out to the web

00:35:46.580 --> 00:35:47.819
if they need more info?

00:35:47.820 --> 00:35:51.779
Or have you worked with any models that work that way?

00:35:51.780 --> 00:35:55.259
No, I've not seen any models that do that.

00:35:55.260 --> 00:35:57.739
There was like a group

00:35:57.740 --> 00:35:59.899
working on something like a package updater

00:35:59.900 --> 00:36:02.499
that would do different diffs on it,

00:36:02.500 --> 00:36:03.939
but models change so much,

00:36:03.940 --> 00:36:05.739
even when you make minor changes and fine-tuning.

00:36:05.740 --> 00:36:07.659
It's hard just to update them in place

00:36:07.660 --> 00:36:10.099
So I haven't seen one, but that doesn't mean

00:36:10.100 --> 00:36:16.259
they're not out there. It's a curious topic, though. Awesome.

00:36:16.260 --> 00:36:19.539
Well, it's probably pretty good timing.

00:36:19.540 --> 00:36:21.299
Let me just scroll and make sure.

00:36:21.300 --> 00:36:23.499
And of course, before I can say that,

00:36:23.500 --> 00:36:25.899
there's one more question. So let's go ahead and have that.

00:36:25.900 --> 00:36:28.299
I want to make sure while we're still live, though,

00:36:28.300 --> 00:36:31.299
I give you a chance to offer any closing thoughts.

00:36:31.300 --> 00:36:35.779
So what scares you most about the agentic tools?

00:36:35.780 --> 00:36:38.419
How would you think about putting a sandbox around that

00:36:38.420 --> 00:36:42.139
if you did adopt an agentic workflow?

00:36:42.140 --> 00:36:42.899
That is a great question.

00:36:42.900 --> 00:36:45.939
In terms of that, I would just control

00:36:45.940 --> 00:36:48.099
what it's able to talk to, what machines,

00:36:48.100 --> 00:36:50.059
I would actually have it be air-gapped.

00:36:50.060 --> 00:36:52.099
I work for a defense contractor,

00:36:52.100 --> 00:36:53.819
and we spend a lot of time dealing with air-gapped systems,

00:36:53.820 --> 00:36:55.979
because that's just kind of the way it works out for us.

00:36:55.980 --> 00:36:58.499
So agentic, it's just going to take a while to get trust.

00:36:58.500 --> 00:37:01.059
I want to see more stuff happening.

00:37:01.060 --> 00:37:02.819
Humans screw up stuff enough.

00:37:02.820 --> 00:37:04.819
The last thing we need is to multiply that by 1000.

00:37:04.820 --> 00:37:09.419
So in terms of that, I would be restricting what it can do.

00:37:09.420 --> 00:37:10.859
If you look at the capabilities,

00:37:10.860 --> 00:37:13.579
if I created a user and gave it permissions,

00:37:13.580 --> 00:37:15.299
I would have it locked down through sudo,

00:37:15.300 --> 00:37:17.379
what it's able to do, what the account's able to do.

00:37:17.380 --> 00:37:18.899
I would do those kind of things,

00:37:18.900 --> 00:37:20.859
but it's happening.

00:37:20.860 --> 00:37:25.819
It's just, I'm going to be one of the laggards on that one.

00:37:25.820 --> 00:37:29.259
So air gap, jail, extremely locked-down environments,

00:37:29.260 --> 00:37:34.899
like we're talking about separate physical machines, not Docker.

00:37:34.900 --> 00:37:37.499
Yeah, hopefully. Right, fair.

00:37:37.500 --> 00:37:39.899
So tool calling can be read-only,

00:37:39.900 --> 00:37:42.539
such as giving models the ability to search the web

00:37:42.540 --> 00:37:43.979
before answering your question,

00:37:43.980 --> 00:37:46.219
you know, write access, execute access.

00:37:46.220 --> 00:37:49.219
I'm interested to know if local models

00:37:49.220 --> 00:37:51.419
are any good at that.

00:37:51.420 --> 00:37:55.579
Yes, local models can do a lot of that stuff.

00:37:55.580 --> 00:37:56.819
It's their capabilities.

00:37:56.820 --> 00:37:59.019
If you load LM Studio, you can do a lot of wonderful stuff

00:37:59.020 --> 00:38:02.419
with that, or with Open WebUI with Ollama.

00:38:02.420 --> 00:38:05.739
It's a lot of capabilities. It's amazing.

00:38:05.740 --> 00:38:08.139
Open WebUI is actually what a lot of companies are using now

00:38:08.140 --> 00:38:10.259
to put their data behind that.

00:38:10.260 --> 00:38:12.139
Their curated data and stuff like that. So it works well.

00:38:12.140 --> 00:38:15.819
I can confirm that from my own professional experience.

00:38:15.820 --> 00:38:19.659
Excellent. Okay, well, our timing should be just perfect

00:38:19.660 --> 00:38:22.659
if you want to give us like a 30-second, 45-second wrap-up.

00:38:22.660 --> 00:38:24.419
Aaron, let me squeeze in mine.

00:38:24.420 --> 00:38:26.779
Thank you again so much for preparing this talk

00:38:26.780 --> 00:38:30.499
and for entertaining all of our questions.

00:38:30.500 --> 00:38:33.299
Yeah, let me just thank you guys for the conference again.

00:38:33.300 --> 00:38:35.179
This is a great one. I've enjoyed a lot of it.

00:38:35.180 --> 00:38:37.339
I've only caught a couple of talks so far,

00:38:37.340 --> 00:38:41.659
but I'm looking forward to hitting the ones after this and tomorrow.

00:38:41.660 --> 00:38:44.739
But the AI stuff is coming. Get on board.

00:38:44.740 --> 00:38:46.939
Definitely recommend it. If you want to just try it out

00:38:46.940 --> 00:38:48.419
and get a little taste of it,

00:38:48.420 --> 00:38:49.779
my minimal viable product

00:38:49.780 --> 00:38:51.619
with just Llamafile and gptel

00:38:51.620 --> 00:38:53.139
will get you to the point where you start figuring it out.

00:38:53.140 --> 00:38:55.579
Gptel is an amazing thing. It just gets out of your way,

00:38:55.580 --> 00:39:00.459
but it works so well with Emacs, by design, because it

00:39:00.460 --> 00:39:01.699
doesn't take your hands off the keyboard.

00:39:01.700 --> 00:39:02.499
It's just another buffer

00:39:02.500 --> 00:39:04.059
and you just put information in there.

00:39:04.060 --> 00:39:06.979
It's quite a wonderful time.

00:39:06.980 --> 00:39:10.819
Let's put it that way. That's all I got. Thank you

00:39:10.820 --> 00:39:14.339
so much once again, and we're just gonna cut away.

00:39:14.340 --> 00:39:15.779
So I'll stop the recording

00:39:15.780 --> 00:39:18.259
and you're on your own recognizance

00:39:18.260 --> 00:39:19.699
Well, I'm gonna punch out

00:39:19.700 --> 00:39:21.059
if anybody has any questions or anything

00:39:21.060 --> 00:39:24.699
my email address is ajgrothe@yahoo.com or at gmail and

00:39:24.700 --> 00:39:26.779
Thank you all for attending

00:39:26.780 --> 00:39:29.939
and thanks again for the conference

00:39:29.940 --> 00:39:32.579
Okay, I'm gonna go ahead and end the room there, thank you.

00:39:32.580 --> 00:39:34.100
Excellent, thanks, bye.