WEBVTT

01:09.080 --> 01:13.280
Wait, so there's a little something though. How are we gonna get the audio?

01:16.240 --> 01:18.240
Give me just a second

01:18.240 --> 01:30.520
Ah, so can we just get a confirmation? Can you start playing the talk now, Sameer, please?

01:43.280 --> 01:47.080
So for everyone on the stream, bear with us just a little bit; we're trying to get it working right now

01:47.080 --> 01:49.180
I'm not getting any audio from you, Sameer

01:53.880 --> 01:59.600
Sameer, you might want to unmute yourself on BBB. So if you could pause the video, go to BBB, and unmute yourself

02:09.320 --> 02:14.080
Okay, Sameer, can you hear me now? Yeah, okay, so

02:14.080 --> 02:17.640
Oh, let me start. Where is it? Okay. There we go

02:23.320 --> 02:29.400
That sounds great, okay, we'll give you just a second to get squared away here and thanks everybody on the stream for bearing with us

02:30.780 --> 02:34.600
Okay, sure, Sameer. Can you now start playing your talk? Yeah

02:34.600 --> 02:42.440
I'm Sameer Pradhan from the Linguistic Data Consortium at the University of Pennsylvania. Can you pause the talk for a second?

02:44.760 --> 02:46.760
What happened?

02:48.760 --> 02:52.600
Oh, you don't have audio. The thing was no audio is

02:54.760 --> 02:56.760
Oh

02:56.760 --> 03:02.760
Okay, Sameer, sorry, we were just doing some last-minute checks. So yes, doing exactly the same thing as you did will be fine, and we'll manage on our end

03:05.760 --> 03:08.760
Sorry everyone on the stream. We're just trying to do some last-minute shuffling

03:10.760 --> 03:14.760
And you are muted on BBB right now, so you will probably need to pause the talk for a second

03:14.760 --> 03:24.760
And you are muted on BBB right now, so you will probably need to unmute yourself on BBB and then start the talk

03:30.760 --> 03:35.760
So Sameer, right now, sorry, could you, no, it's not working. You need to unmute yourself on BBB

03:35.760 --> 03:37.760
So right now you need to click the button, the microphone

03:37.760 --> 03:42.760
Yes, you toggled it off again. Toggle it on again, please

03:46.760 --> 03:48.760
What am I doing wrong?

03:49.760 --> 03:55.760
So do not mute yourself now. Leave your microphone on, go back to the beginning of your video, and press play

03:55.760 --> 03:57.760
Yes, from various signals

03:58.760 --> 04:01.760
The work we present is limited to text and speech

04:01.760 --> 04:06.760
So do not mute yourself now. Leave your microphone on, go back to the beginning of your video, and press play

04:06.760 --> 04:08.760
Yes, from various signals

04:09.760 --> 04:12.760
The work we present is limited to text and speech

04:12.760 --> 04:13.760
Good approaching

04:13.760 --> 04:14.760
But it can be extended

04:15.760 --> 04:22.760
Thank you for joining me today. I am Sameer Pradhan from the Linguistic Data Consortium at the University of Pennsylvania

04:23.760 --> 04:26.760
And founder of cemantix.org

04:26.760 --> 04:33.760
We do research in computational linguistics, also known as natural language processing, a sub-area of artificial intelligence

04:34.760 --> 04:40.760
With a focus on modeling and predicting complex linguistic structures from various signals

04:41.760 --> 04:47.760
The work we present is limited to text and speech, but it can be extended to other signals

04:47.760 --> 04:57.760
We propose an architecture, and we call it GRAIL, which allows the representation and aggregation of such rich structures in a systematic fashion

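NOTE
As a rough sketch of the kind of layered, aggregated representation
GRAIL aims at (the headline names and property keys below are
illustrative guesses, not GRAIL's actual format), an Org outline can
stack annotation layers over one source signal:
* document
** source
:PROPERTIES:
:SIGNAL: speech
:END:
** layer: tokens
** layer: syntax
** layer: coreference
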
04:59.760 --> 05:11.760
I'll demonstrate a proof of concept for representing and manipulating data and annotations for the specific purpose of building machine learning models that simulate understanding

05:11.760 --> 05:20.760
These technologies have the potential for impact in almost any conceivable field that generates and uses data

05:22.760 --> 05:32.760
We process human language when our brains receive and assimilate various signals, which are then manipulated and interpreted within a syntactic structure

05:33.760 --> 05:39.760
It's a complex process that I have simplified here for the purpose of comparison to machine learning

05:39.760 --> 05:51.760
Recent machine learning models tend to require a large amount of raw, naturally occurring data, and a varying amount of manually enriched data, commonly known as annotations

05:52.760 --> 06:01.760
Owing to the complex and numerous nature of linguistic phenomena, we have most often used a divide-and-conquer approach

06:01.760 --> 06:09.760
The strength of this approach is that it allows us to focus on a single or perhaps a few related linguistic phenomena

06:10.760 --> 06:17.760
The weaknesses are, first, that the universe of these phenomena keeps expanding as language itself evolves and changes over time

06:18.760 --> 06:26.760
And second, this approach requires an additional task of aggregating annotations, creating more opportunities for computer error

06:26.760 --> 06:40.760
Our challenge then is to find the sweet spot that allows us to encode complex information without the use of manual annotation or without the additional task of aggregation by computers

06:42.760 --> 06:45.760
So what do I mean by annotation?

06:45.760 --> 06:59.760
In this talk, the word annotation refers to the manual assignment of certain attributes to portions of a signal, which is necessary to perform the end task

07:00.760 --> 07:11.760
For example, in order for the algorithm to accurately interpret a pronoun, it needs to know what that pronoun refers back to

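NOTE
For instance, a coreference annotation might record explicitly what a
pronoun refers back to. A hypothetical sketch in Org syntax (the
property names are invented for illustration):
* Sentence: "Sameer gave his talk."
** mention: Sameer
:PROPERTIES:
:ID: e1
:END:
** mention: his
:PROPERTIES:
:REFERS-TO: e1
:END:
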
07:11.760 --> 07:19.760
We may find this task trivial; however, current algorithms repeatedly fail at this task

07:20.760 --> 07:26.760
So the complexities of understanding in computational linguistics require annotation

07:27.760 --> 07:36.760
The word annotation itself is a useful example because it also reminds us that words have multiple meanings, as annotation itself does

07:36.760 --> 07:51.760
Just as I needed to define it in this context so that my message won't be misinterpreted, so too must annotators do, at least for algorithms, through manual intervention

07:52.760 --> 07:58.760
Learning from raw data, commonly known as unsupervised learning, poses limitations for machine learning

07:59.760 --> 08:04.760
As I described, modeling complex phenomena needs manual annotations

08:04.760 --> 08:10.760
The learning algorithm uses these annotations as examples to build statistical models

08:11.760 --> 08:13.760
This is called supervised learning

08:13.760 --> 08:37.760
Without going into too much detail, I'll simply note that the recent popularity of the concept of deep learning is an evolutionary step where we have learned to train models using trillions of parameters in ways that they can learn richer hierarchical structures from very large amounts of unannotated data

08:37.760 --> 08:49.760
These models can then be fine-tuned using varying amounts of annotated examples, depending on the complexity of the task, to generate better predictions

08:50.760 --> 09:01.760
As you might imagine, manually annotating complex linguistic phenomena can be a very specific, labor-intensive task

09:01.760 --> 09:09.760
For example, imagine if we were to go back through this presentation and connect all the pronouns with the nouns to which they refer

09:10.760 --> 09:14.760
Even for a short, 18-minute presentation, this would require hundreds of annotations

09:15.760 --> 09:20.760
The models we build are only as good as the quality of the annotations we make

09:20.760 --> 09:31.760
We need guidelines that ensure that the annotations are done by at least two humans who have substantial agreement with each other in their interpretations

09:32.760 --> 09:40.760
We know that if we try to train a model using annotations that are very subjective or have more noise, we will receive poor predictions

09:41.760 --> 09:47.760
Additionally, there is the concern of introducing various unexpected biases into one's models

09:47.760 --> 09:54.760
So, annotation is really both an art and a science

09:55.760 --> 09:59.760
In the remaining time, we will turn to two fundamental questions

10:00.760 --> 10:09.760
First, how can we develop a unified representation of data and annotations that encompasses arbitrary levels of linguistic information?

10:10.760 --> 10:14.760
There is a long history of attempting to answer this first question

10:14.760 --> 10:18.760
This history is documented in our recent article

10:19.760 --> 10:26.760
It is as if we as a community have been searching for our own holy grail

10:27.760 --> 10:35.760
The second question we will pose is: what role might Emacs, along with Org Mode, play in this process?

10:35.760 --> 10:46.760
While the solution itself may not be tied to Emacs, Emacs has built-in capabilities that could be useful for evaluating potential solutions

10:47.760 --> 10:55.760
It is also one of the most extensively documented pieces of software and the most customizable piece of software that I have ever come across

10:56.760 --> 11:00.760
Many would agree with that

11:00.760 --> 11:08.760
In order to approach this second question, we turn to the complex structure of language itself

11:09.760 --> 11:13.760
At first glance, language appears to us as a series of words

11:14.760 --> 11:20.760
Words form sentences, sentences form paragraphs, and paragraphs form complete texts

11:20.760 --> 11:30.760
If this was a sufficient description of the complexity of language, all of us would be able to read at least 10 different languages

11:31.760 --> 11:33.760
We know it is much more complex than this

11:34.760 --> 11:37.760
There is a rich, underlying, recursive tree structure

11:38.760 --> 11:45.760
In fact, there are many possible tree structures underlying a particular sequence

11:45.760 --> 11:50.760
One of the better understood tree structures is the syntactic structure

11:51.760 --> 12:00.760
While a natural language has rich ambiguities and complexities, programming languages are designed to be parsed and interpreted deterministically

12:01.760 --> 12:10.760
Emacs has been used for programming very effectively, so there is a potential for using Emacs as a tool for annotation

12:10.760 --> 12:14.760
This would significantly improve our current set of tools

12:15.760 --> 12:26.760
It is important to note that most of the annotation tools that have been developed over the past few decades have relied on graphical interfaces

12:27.760 --> 12:30.760
Even those have relied on enumerated textual indices

12:30.760 --> 12:42.760
Most of the tools in use are designed for a user to add very specific, very restricted information

12:43.760 --> 12:51.760
They have not really made use of the potential that a rich editing environment like Emacs can add to the mix

12:51.760 --> 12:59.760
Emacs has long been able to edit and manipulate the complex, embedded tree structures present in source code

13:00.760 --> 13:05.760
It is difficult to imagine the capabilities that we represent naturally

13:06.760 --> 13:13.760
In fact, it always does that, with features that allow us to quickly navigate sentences and paragraphs with a few keystrokes

13:13.760 --> 13:20.760
Add various text properties, create overlays, to name a few, etc.

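NOTE
A minimal Emacs Lisp sketch of the capabilities mentioned here,
attaching an overlay and an annotation label to a span of text (the
'my-annotation property is invented for illustration):
(defun my-annotate-region (beg end label)
  "Highlight the region BEG..END and record LABEL on an overlay."
  (interactive "r\nsLabel: ")
  (let ((ov (make-overlay beg end)))
    (overlay-put ov 'face 'highlight)       ; visible cue for the annotator
    (overlay-put ov 'my-annotation label))) ; machine-readable layer
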
13:21.760 --> 13:33.760
Emacs has built-in support for Unicode, so we don't have to worry about the complexity of managing more languages

13:34.760 --> 13:41.760
In fact, this is not the first time Emacs has been used to curate linguistic resources

13:41.760 --> 13:56.760
One of the true milestones in natural language processing was the creation of a manually annotated syntactic treebank for a million-word collection of Wall Street Journal articles

13:57.760 --> 14:03.760
This was about 1990, before Java or graphical interfaces were common

14:03.760 --> 14:13.760
The tool that was used to create that corpus was Emacs, and the corpus, created at Penn, is known as the Penn Treebank

14:14.760 --> 14:26.760
And in 1992, the Linguistic Data Consortium was established; it has now been creating various language-related resources for about 30 years

14:26.760 --> 14:36.760
First, outlining mode, or in particular Org Mode, the rather enhanced outlining mode

14:37.760 --> 14:49.760
It allows us to create rich outlines, attaching properties to nodes, and provides commands for customizing the various pieces of information as per one's requirements

14:49.760 --> 14:56.760
This is also a very useful tool

14:57.760 --> 15:03.760
This enhanced outlining mode provides more power to Emacs

15:04.760 --> 15:13.760
It provides commands for easily customizing and entering information, and at the same time hiding unnecessary context

15:13.760 --> 15:25.760
It allows controlled editing; this could be a very useful tool when we are focused on a limited amount of data

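NOTE
The features alluded to here are standard Org commands; a minimal
sketch of focusing on one node, attaching a property, and widening
again (the property name and value are invented for illustration):
(org-narrow-to-subtree)                         ; hide everything else
(org-set-property "SENSE" "instrument-reading") ; attach information
(widen)                                         ; restore the full view
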
15:26.760 --> 15:37.760
These tools together allow us to create a rich representation that can simultaneously capture multiple possible interpretations

15:37.760 --> 15:42.760
Capture the details necessary to recreate the original sources

15:43.760 --> 15:52.760
It allows us to create hierarchical representations, with structural capabilities that can take advantage of the concept of inheritance within the tree structure

15:53.760 --> 16:00.760
Together they allow local manipulation of the structure, thereby minimizing data coupling

16:00.760 --> 16:06.760
The concept of tags in outlining mode complements the hierarchy pattern

16:07.760 --> 16:12.760
Hierarchies can be very rigid, but through tags on hierarchies we can have multi-faceted representations

16:13.760 --> 16:19.760
As a matter of fact, outlining mode has the ability for tags to have their own hierarchical structure

16:20.760 --> 16:23.760
This further enhances the representational power

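NOTE
Tag hierarchies like these are configured in Org with group tags; a
small sketch (the tag names are invented for illustration):
(setq org-tag-alist
      '((:startgrouptag)
        ("phenomenon")   ; parent tag
        (:grouptags)
        ("coreference")  ; searching for "phenomenon" matches these too
        ("disfluency")
        (:endgrouptag)))
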
16:23.760 --> 16:29.760
All of this can be done in sequence, mostly through functional transformations

16:30.760 --> 16:36.760
Because most capabilities can be configured and customized, it is not necessary to do everything at once

16:37.760 --> 16:41.760
It allows us to grow the complexity of the representation incrementally

16:42.760 --> 16:46.760
Finally, all of this can be done in a plain text representation

16:47.760 --> 16:50.760
It has its own advantages

16:50.760 --> 16:55.760
Now let's look at a simple example

16:56.760 --> 17:02.760
The sentence is "I saw the moon with a telescope"

17:03.760 --> 17:07.760
Let's just make a tree view of the sentence

17:07.760 --> 17:19.760
What is interesting is that it has a noun phrase, "I", followed by a verb, "saw"

17:20.760 --> 17:27.760
Then "the moon" is another noun phrase and "with a telescope" is a prepositional phrase

17:27.760 --> 17:42.760
Now, one thing you might remember from grammar school syntax is that there is a syntactical structure

17:42.760 --> 17:57.760
And in this particular case

18:12.760 --> 18:23.760
Because we know that the moon is not something that can hold the telescope

18:24.760 --> 18:37.760
That seeing must be by me, or by "I", and the telescope must be in my hand, or I am viewing the moon with a telescope

18:37.760 --> 18:48.760
However, it is possible that in a different context, the moon could refer to a character in an animated picture

18:48.760 --> 19:07.760
And could hold the telescope, in that case, the situation might be that I am actually seeing a moon holding a telescope

19:07.760 --> 19:24.760
And this is one of the most complex linguistic phenomena that requires world knowledge

19:24.760 --> 19:38.760
And it is called the PP-attachment problem, where the prepositional phrases can be ambiguous and various different cues have to be used to resolve the ambiguity

19:39.760 --> 19:45.760
So in this case, as you saw, both the readings are technically true depending on the context

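NOTE
A sketch of the duplicated-subtree idea described next, as a plain Org
outline (the labels are invented for illustration):
* S: "I saw the moon with a telescope"
** reading A :instrument:
The telescope is the instrument I used to see the moon.
** reading B :modifier:
The moon (say, a cartoon character) is holding the telescope.
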
19:45.760 --> 19:55.760
So one thing we could do is cut the tree and duplicate it, and then create another node for the alternative

19:56.760 --> 20:02.760
And because this is one of the two interpretations, let's call one version A

20:02.760 --> 20:15.760
And that version, essentially titled version A, says that the moon is holding the telescope

20:15.760 --> 20:31.760
Now we create another representation, where we capture the other interpretation, where I, not the moon, am holding the telescope

20:32.760 --> 20:38.760
Sorry everyone to interrupt the audio here. Sameer, can you move your mouse a little bit? Just move it to the corner of your screen, thank you

20:38.760 --> 20:49.760
Now we have two separate interpretations in the same structure, and all we had to do was very quickly add a few keystrokes

20:49.760 --> 21:08.760
Now let's add another interesting thing. These are two different interpretations. The subject can be A, it can be B, it can be C, or it can be D

21:08.760 --> 21:23.760
Basically, any entity that has the ability to see can be substituted in this particular node

21:23.760 --> 21:37.760
And let's see what we have here. Now we are just getting a zoomed-out view of the entire structure we have created

21:37.760 --> 21:52.760
Essentially, you can see that by just using a few keystrokes, we are able to capture two different interpretations of a simple sentence

21:52.760 --> 22:06.760
And we are also able to add various alternate pieces of information that could help machine algorithms generalize better

22:06.760 --> 22:25.760
Now let's go to the next thing. In a sense, we can use the power of these constructs to represent potentially conflicting structured readings

22:25.760 --> 22:38.760
In addition to this, we can also create texts with different structures and have them in the same place. This allows us to address the interpretation of certain sentences as they may be occurring in the world

22:38.760 --> 23:03.760
While simultaneously giving information that can be more valuable. This makes the enrichment process very efficient. Additionally, we can give power to users, who can not only expand the structure but also add information into it

23:03.760 --> 23:19.760
In a way, that could help machine algorithms generalize better by making efficient use of these functions. Together, Emacs and Org Mode can speed the enrichment of nodes in a way that allows us to focus on certain aspects and ignore others

23:20.760 --> 23:28.760
Extremely complex linguistic structures can be captured consistently in a fashion that allows computers to simulate understanding of the language

23:28.760 --> 23:35.760
We can then use tools to enhance the tasks that we do in our everyday life

23:36.760 --> 23:49.760
This is the acronym for the type of specification that we are creating to capture this rich representation

23:49.760 --> 24:06.760
We will now look at an example of spontaneous speech that occurs in spoken conversations. Conversations consistently contain interruptions, disfluencies, non-verbal sounds such as coughs or laughter, and other noises

24:06.760 --> 24:22.760
Since spontaneous speech is a continuous stream, we cannot take back words that come out of our mouths. We tend to make mistakes and correct ourselves as soon as we realize that we have misspoken

24:22.760 --> 24:35.760
This process manifests through a combination of a handful of mechanisms, including immediate repair after an error, and we do this unconsciously

24:35.760 --> 24:51.760
What we've got here is an example of an utterance that shows various aspects of the representation

24:51.760 --> 25:14.760
We don't have time to go through many of the details. I would highly encourage you to play with it. I'm making some recordings with asciinema that I'll be posting, and if you're interested you can go through those

25:14.760 --> 25:33.760
The idea here is to try a slightly more complex use case, but given the time constraints and the amount of information that can fit on the screen, this should be very informative

25:33.760 --> 25:46.760
But at least you'll get some idea of what can be done. In this particular case, there's a sentence with a disfluency, which is what I am showing now

25:47.760 --> 25:59.760
Essentially, there is a repetition of the "I am", and then the speaker starts to say one word, and then corrects themselves

25:59.760 --> 26:13.760
So in this case, we can capture a sequence of words

26:13.760 --> 26:30.760
The interesting thing is that in NLP, we typically have to capture words together with their interpretation in context

26:30.760 --> 26:55.760
You can see here, this view shows that each of the words in the sentence, or in the representation, can have a lot of different properties attached to them

26:55.760 --> 27:07.760
And these properties are hidden by default, like in the earlier slide, but you can use the values of all these properties for various kinds of searches and filtering

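NOTE
Property-based filtering like this is what org-match-sparse-tree
provides; a one-line sketch (BROKEN is an invented property key):
(org-match-sparse-tree nil "BROKEN=\"yes\"")
This folds the buffer down to just the nodes whose BROKEN property is
"yes", which is handy for reviewing one phenomenon at a time.
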
27:07.760 --> 27:27.760
And the slide here is actually not quite legible; on the right are descriptions of what each of these represents. This information is also available in the article, and you can see it there

27:27.760 --> 27:38.760
But it shows how rich a context you can capture; this is just a closer snapshot of the properties on the word

27:39.760 --> 27:50.760
And you can see we can note whether a word is broken or incomplete, or whether some words should be filtered out for parsing and thus ignored

27:50.760 --> 28:00.760
Or some words are restarts, so we can add a restart marker, and so on

28:01.760 --> 28:10.760
The other fascinating thing about this representation is that you can edit properties in the column view

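NOTE
In-place property editing is what Org's column view (M-x org-columns)
offers; a possible column format, with invented property names:
(setq org-columns-default-format
      "%40ITEM %10BROKEN %10RESTART %12SENSE")
With point in the outline, org-columns then displays these properties
as an editable table over the headings.
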
28:10.760 --> 28:20.760
So you have this columnar data structure combined with the hierarchical data structure, as you can see, though you may not be able to see it here

28:20.760 --> 28:48.760
What has also happened here is that some of the tags have been inherited from parent nodes, and so you get a much better picture of things; essentially you can filter out the things that you want to access, access them, and then integrate them into the model

28:48.760 --> 29:04.760
So in conclusion, today we have proposed and implemented the GRAIL architecture, which allows representation, manipulation, and aggregation of rich linguistic structures in a systematic fashion

29:05.760 --> 29:15.760
We've shown how GRAIL advances the tools available for building machine learning models that simulate understanding

29:15.760 --> 29:22.760
Thank you very much for your attention; my contact information is on this slide

29:23.760 --> 29:41.760
If you are interested in an additional sample to demonstrate the representation of speech and text together, continue; otherwise we'll stop here

29:41.760 --> 29:46.760
Is it okay to stop?

29:47.760 --> 29:53.760
Yes, Sameer, it's okay to stop now. Thank you so much for your talk. Are you able to see the pad on your end?

29:54.760 --> 29:58.760
In the etherpad?

29:59.760 --> 30:01.760
Yes, in the etherpad, do you have the link?

30:02.760 --> 30:06.760
I'm there, nothing has happened so far

30:06.760 --> 30:16.760
I'm going to put a link to the pad, give me just a second right now. Corwin, feel free to interrupt me whenever you're here

30:17.760 --> 30:23.760
I'm actually looking at the pad, I don't think anything is added in

30:23.760 --> 30:37.760
There don't seem to be questions yet, yes. It's probably because of the audio problem people might have a little bit of trouble hearing you talk

30:38.760 --> 30:46.760
Do you have anything else you'd like to add on your talk maybe? Because I think it was an excruciating process to get it out to us

30:47.760 --> 30:51.760
You had to kill a lot of darlings in the process, didn't you?

30:51.760 --> 30:59.760
Yeah, in the process of preparing this talk you had to cut a lot of stuff that you wanted to include in your talk

31:00.760 --> 31:07.760
Can I ask you to put on your webcam or something? Are you able to do this?

31:07.760 --> 31:25.760
I'm starting to see a few questions come in. Just let us know when you're ready

31:26.760 --> 31:31.760
Corwin, I'll let you take over. Can you hear me?

31:31.760 --> 31:39.760
Yeah, I hear you, the audio is just a little bit choppy, but we'll just talk slowly and hopefully that will work fine

31:40.760 --> 31:51.760
Well thanks for the great talk, that was just kind of mind blowing actually, I'm looking forward to re-watching it probably two or three times

31:52.760 --> 31:54.760
Who is this?

31:55.760 --> 31:57.760
This is Corwin again

31:57.760 --> 32:04.760
Okay, so we do have a few questions coming in

32:05.760 --> 32:08.760
I'm going to answer them

32:09.760 --> 32:14.760
Okay, well I can read them to you and then we'll transcribe your answers if you'd like to answer them live

32:14.760 --> 32:29.760
Oh, I see, let me do that. The dataset you've brought up, the Penn Treebank, has been studied to death, and people are putting out near-perfect scores on it

32:30.760 --> 32:35.760
But that's not quite the point, I mean sometimes

32:36.760 --> 32:39.760
Oh, I should also speak slowly

32:39.760 --> 32:54.760
Sometimes the research community goes too far and overuses the evaluations, and the results don't really transfer to other domains

32:54.760 --> 33:14.760
But richer and newer data are becoming available; a couple of my colleagues and I are currently in the process of getting new data so that we can actually make the learning models better

33:14.760 --> 33:35.760
Oh shoot, and then I failed to unmute myself on the stream here

33:35.760 --> 33:43.760
And I think you're answering in text right now one of these, so I'll just let you drive

33:44.760 --> 33:51.760
So one thing I'll add is, please read the question that you're answering when you read out your answers

33:52.760 --> 33:55.760
Oh, I see, yes

33:55.760 --> 34:06.760
And we're showing the pad on the stream so people are seeing the text and that's probably a good approach considering we're having a little shakiness with the audio

34:25.760 --> 34:35.760
In fact, I think my audio may be pretty stable, so I'll just start reading out both the questions and the answers

34:36.760 --> 34:42.760
But Sameer, if you want to, you're welcome to interrupt me if you want to expand on your remarks at all

34:43.760 --> 34:52.760
So the first question was, has the '92 Penn corpus-of-articles feat been reproduced over and over again using these tools?

34:52.760 --> 35:01.760
The answer was not quite, that was sort of a first wave, the particular corpus was the first one that started a revolution, kind of

35:02.760 --> 35:13.760
But there are more corpora being made available; in fact I spent about 8 years, a decade ago, building a much larger corpus with more layers of information

35:13.760 --> 35:27.760
And it is called OntoNotes; it covers Chinese and Arabic, it was DARPA-funded, and it is freely available for research to anyone, anywhere

35:28.760 --> 35:32.760
That was quite a feature, quite a feat

35:32.760 --> 35:45.760
The next question: is this only for natural languages like English, or more general; could this be used for programming languages?

35:46.760 --> 35:54.760
Sameer said, I am using English as a use case, but the idea is to have it completely multilingual

35:54.760 --> 36:12.760
I cannot think why you would want to use it for programming languages, in fact the AST in programming languages is sort of what we are trying to build upon

36:12.760 --> 36:29.760
So that one can capture the abstract representation and help the models learn better

36:29.760 --> 36:49.760
These days the models are trained on a boatload of data, and so they tend to be overfitted to the data

36:49.760 --> 37:13.760
So if you have a smaller data set, which is not quite the same as the one that you had the training data for, then the models really do poorly

37:13.760 --> 37:29.760
It is sometimes compared to learning the sine function, using the points on the sine wave, as opposed to deriving the function itself

37:29.760 --> 37:46.760
You can get close, but then you cannot really do a lot better with that model

37:47.760 --> 37:56.760
This is sort of what is happening with the deep learning hype

37:56.760 --> 38:13.760
It is not to say that there hasn't been a significant advancement in the tech, in the technologies

38:13.760 --> 38:28.760
But to say that the models can learn is an extreme overstatement

38:28.760 --> 38:46.760
Awesome answer. I'm going to scroll my copy of the pad down just a little bit, and we'll just take a moment to start looking at the next question

38:46.760 --> 38:57.760
So I'll read that out. Reminds me of the advantages of pre-computer copy and paste, cutting up paper and rearranging it, but having more stuff attached to your pieces

38:58.760 --> 39:11.760
Right. Kind of like that, but more intelligent than copy-paste, because you could have various local constraints that would ensure the information is consistent with the whole

39:11.760 --> 39:30.760
I am also envisioning this as a use case of hooks

39:30.760 --> 39:57.760
And if you can have rich local dependencies, then you can be sure, as much as you can, that the information signal is not too corrupted

39:57.760 --> 40:22.760
Have you used it on real life situations? No. I am probably the only person who is doing this crazy thing

40:22.760 --> 40:47.760
It would be nice, or rather, I have a feeling that something like this, if worked on for a while by many people, might lead to a really, really potent tool for the masses

40:47.760 --> 41:00.760
I feel strongly about using, sorry, I feel strongly about giving such power to the users

41:00.760 --> 41:17.760
And to be able to edit and share the data openly, so that it is not stuck in some corporate vault somewhere

41:17.760 --> 41:31.760
Amen. One thing at a time. Plus one for that as well.

41:31.760 --> 41:47.760
Alright, and I will read out the next question. Do you see this as a format for this type of annotation specifically, or something more general that can be used for interlinear glosses, lexicons, etc?

41:47.760 --> 42:12.760
Absolutely. In fact, the project I mentioned, OntoNotes, has multiple layers of annotation, one of them being the propositional structure, which uses a large lexicon that covers about 15k verbs, so 15,000 verbs, and nouns

42:12.760 --> 42:25.760
and all their argument structures that we have been seeing so far in the corpora

42:26.760 --> 42:35.760
This is about a million propositions that have been released recently

42:35.760 --> 42:57.760
We just recently celebrated the 20th birthday of the corpus. It is called PropBank.

42:57.760 --> 43:19.760
There is an interesting history of the banks. It started with the Treebank, and then there was PropBank, with a capital B

43:19.760 --> 43:47.760
But then, when we were developing OntoNotes, which contains syntax, named entities, coreference resolution, propositions, word sense, all in the same whole

43:47.760 --> 43:57.760
Sorry for the interruption. We have about 5 minutes and 15 seconds.

43:58.760 --> 44:09.760
That sounds good. If you want to just read it out, then. I think that would be the most important thing, that people can hear your answers, and I and the other volunteers will be going through and trying to transcribe this.

44:09.760 --> 44:19.760
So go for it.

44:20.760 --> 44:25.760
So, Sameer, just to make sure, did you have something to say right now?

44:25.760 --> 44:39.760
Oh, okay. I think these are all good questions, and there is a lot of it, and clearly the amount of time is not enough.

44:39.760 --> 44:54.760
But I am trying to figure out how to have a community that can help with such an effort.

44:54.760 --> 45:12.760
One of the things that I am thinking that this could make possible is to take all the disparate resources that have inconsistent or not quite compatible additions on them,

45:12.760 --> 45:34.760
and which are right now just silos of data, small islands of data floating in the sea. But this representation could really bring them all together, and then they could be much richer, fuller, and consistent.

45:34.760 --> 45:46.760
Like you said, one of you was asking about the islands and the sub-corpora that have sentiment information.

45:46.760 --> 46:09.760
Yeah, there's a lot of variety. The way it could be used for common people is to potentially make available data that the current models don't recognize,

46:09.760 --> 46:19.760
so that more people can use the data and not be biased towards one or the other.

46:19.760 --> 46:42.760
And there is one more thing: when people train these models using huge amounts of data, no matter how big the data is, it is a small cross-section of the universe of data, and depending on what cross-section you select, your model will be biased towards it.

46:42.760 --> 46:56.760
And some people will be interested in using them for purpose X, but somebody else might want to use them for purpose Y, and if the data is not open, then it's harder to do that.

47:00.760 --> 47:09.760
Okay, so I think we've got just about 100 seconds left, so if you have any closing remarks you want to share, and then we'll start transitioning.

47:09.760 --> 47:17.760
Thank you so much, I really appreciate, this was a great experience, frankly.

47:17.760 --> 47:46.760
I've never given a completely pre-recorded talk before; I guess in a way it was for a different audience. It was extremely helpful, and I learned that the planning sort of tries to create a community.

47:47.760 --> 47:52.760
Thank you so much.

47:53.760 --> 47:58.760
I'll take over; we are going to move to the next talk. Thank you so much, Sameer, and sorry for the technical difficulties.

47:58.760 --> 48:23.760
As Corwin said, we will try to salvage as much of the information that was shared during this Q&A; we will file everything away where we can use it, and make captions and all this, so don't worry about the difficulties.

48:28.760 --> 48:29.760
Thank you.