1)
Message boards :
Number crunching :
How many cores are required?
(Message 721)
Posted 1 hour ago by Demis Post: There can be many different reasons. Let's start with the simplest question: when you launched the boinc-client, how many tasks was it set to run simultaneously? |
2)
Questions and Answers :
Server any other problems :
Some data has been corrected
(Message 719)
Posted 1 day ago by Demis Post: Next step: duplicates. My data analysis showed that duplicates fall into two large groups: 1. Duplicates at the physical level. 2. Duplicates at the logical level. Group 2 always includes every duplicate from group 1. The duplicates at the physical level have now been eliminated. |
3)
Questions and Answers :
Server any other problems :
Some data has been corrected
(Message 716)
Posted 8 days ago by Demis Post: Two tasks out of 162 are scheduled for re-counting today for crunchers. The remaining 160 tasks have now been reassigned for recalculation. |
4)
Questions and Answers :
Server any other problems :
Some data has been corrected
(Message 714)
Posted 10 days ago by Demis Post: Two tasks out of 162 are scheduled for re-counting today for crunchers. The first of the two has been received.
https://boinc.termit.me/adsl/workunit.php?wuid=15413
The check showed that the result is now correct:
https://boinc.termit.me/adsl/spt_explore.php?spt=16&s=4687939591477390991
https://boinc.termit.me/adsl/spt_explore.php?spt=16&s=4687939755461166661
https://boinc.termit.me/adsl/spt_explore.php?spt=16&s=4687939864673207491
https://boinc.termit.me/adsl/spt_explore.php?spt=16&s=4687941031016593387
https://boinc.termit.me/adsl/spt_explore.php?tpt=14&s=4687939512514954517
https://boinc.termit.me/adsl/spt_explore.php?stpt=10&s=4687940599903351247
https://boinc.termit.me/adsl/spt_explore.php?stpt=10&s=4687940605722467279
https://boinc.termit.me/adsl/spt_explore.php?stpt=10&s=4687940688488835077
The mechanism for searching and reassigning problematic answers is now clear. Work in this direction continues... |
5)
Questions and Answers :
Server any other problems :
Some data has been corrected
(Message 713)
Posted 16 days ago by Demis Post: Two tasks out of 162 are scheduled for re-counting today for crunchers. There are errors in the database that arose from hardware errors on the crunchers' machines. If these two tasks show no problems, the remaining 160 tasks will also be reassigned for recalculation. |
6)
Questions and Answers :
Automatically generated server job :
SPT task
(Message 712)
Posted 17 days ago by Demis Post: Batch 138: 10160928384525935453 begin: 10160928379391935453, end: 10160928389659935453 Count: 1 Makes an overlap of -5134000000 and +5134000000. This is a special wu created for overlap_135_137 |
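The begin and end values in this post are consistent with a window centred on the batch boundary and extended by 5134000000 in each direction. A minimal sketch of that arithmetic, assuming this is how the overlap is derived (the function and constant names are hypothetical, not taken from the server code):

```python
# Hypothetical helper: the names are illustrative, not from the project's server code.
OVERLAP_HALF_WIDTH = 5_134_000_000  # from the post: "-5134000000 and +5134000000"


def overlap_range(boundary: int, half_width: int = OVERLAP_HALF_WIDTH) -> tuple[int, int]:
    """Return (begin, end) of a window centred on a batch boundary."""
    return boundary - half_width, boundary + half_width


# Boundary quoted for Batch 138. Python integers are arbitrary precision,
# so values above 2**63 need no special handling here.
begin, end = overlap_range(10160928384525935453)
print(begin, end)
```

Note that these values exceed the signed 64-bit range, so a port to another language would need an unsigned 64-bit or wider integer type.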
7)
Questions and Answers :
Automatically generated server job :
SPT task
(Message 711)
Posted 17 days ago by Demis Post: Batch 137: 9911328384525935453 .. 10160928384525935453 -1 Count: 128000 Continue from 9,91E+18 |
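As a rough check on the batch size: if the batch is split into equal-width work units (an assumption; the post only gives the range and the count), the width per work unit follows from dividing the range by the count. The variable and function names below are hypothetical.

```python
# Figures quoted in the post for Batch 137.
BATCH_START = 9911328384525935453
BATCH_END = 10160928384525935453  # treated as exclusive, per the "-1" in the post
COUNT = 128000

# Assumption: every work unit covers the same width.
step, remainder = divmod(BATCH_END - BATCH_START, COUNT)


def workunit_start(index: int) -> int:
    """Start value of the work unit with the given 0-based index."""
    return BATCH_START + index * step


print(step, remainder)
print(workunit_start(0), workunit_start(COUNT - 1))
```

Under that assumption the width comes out as 1950000000000 with no remainder, which matches the step quoted in Message 706 for a work unit from an earlier batch.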
8)
Questions and Answers :
Server any other problems :
Some data has been corrected
(Message 709)
Posted 18 days ago by Demis Post: There are errors in the database that arose from hardware errors on the crunchers' machines. As of now we have 888 incorrect answers from crunchers. All of these incorrect answers come from 162 workunits.
Total number of workunits issued: 2 691 776
Total number of values received: 16 867 165
Share of affected workunits: 162 / 2691776 = 0.0060183313916165386718657124515561%
Share of incorrect answers: 888 / 16867165 = 0.0052646665874199961878596669920523%
These are just the current error statistics. |
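The two percentages can be reproduced from the four counts in the post. A small sketch (the function and variable names are illustrative):

```python
def percent(part: int, total: int) -> float:
    """Return part as a percentage of total."""
    return 100 * part / total


# Figures quoted in the post.
workunits_issued = 2_691_776
bad_workunits = 162
values_received = 16_867_165
bad_values = 888

print(f"affected workunits: {percent(bad_workunits, workunits_issued):.7f}%")
print(f"incorrect answers:  {percent(bad_values, values_received):.7f}%")
```

Both shares are well under one hundredth of a percent, roughly 0.0060% of workunits and 0.0053% of received values.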
9)
Message boards :
Number crunching :
Problem with validation.
(Message 708)
Posted 21 days ago by Demis Post: ... This issue has been clearly identified and resolved. |
10)
Questions and Answers :
Server any other problems :
Some data has been corrected
(Message 707)
Posted 25 days ago by Demis Post: And finally. Common sense... |
11)
Questions and Answers :
Server any other problems :
Some data has been corrected
(Message 706)
Posted 25 days ago by Demis Post: Take another look. https://boinc.termit.me/adsl/forum_thread.php?id=67&postid=696#696
I show the data in the format in which the server stores and retrieves it, to rule out trivial conversion mistakes. It is simply easier for me. Is that really not clear? As I wrote above: only by recalculation. That is, the task was recalculated with the spt program, which every cruncher has. And yes, it takes two hours on the computer where I ran it. Yes, I recalculated it. Here you are:
5499120046153320487: [0, 54, 84, 94, 96, 130, 150, 172, 174, 196, 216, 250, 252, 262, 292, 346]
5499120251551369451: [0, 30, 50, 126, 162, 182, 192, 242, 246, 296, 306, 326, 362, 438, 458, 488]
5499120773581271527: [0, 22, 40, 70, 82, 132, 150, 202, 210, 262, 280, 330, 342, 372, 390, 412]
5499121289947186217: [0, 44, 50, 66, 110, 134, 140, 156, 164, 180, 186, 210, 254, 270, 276, 320]
5499121372440344689: [0, 34, 52, 54, 108, 154, 258, 264, 358, 364, 468, 514, 568, 570, 588, 622]
5499120954814009877: [0, 2, 72, 74, 132, 134, 144, 146, 222, 224, 342, 344, 384, 386]
5499121634173665539: [0, 2, 18, 20, 30, 32, 42, 44, 60, 62]
But double-check it yourself, because I did this by hand. Yes. I showed it above.
5499119934525935453..5499121884525935453 (step:1950000000000) |
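The offset lists in this post appear to be the compact "ofs" gap values from Message 696 expanded into cumulative offsets from the first prime. The sketch below reproduces that expansion under two assumptions inferred from the data, not confirmed by the project: for the symmetric kinds (spt, stpt) with even k, the stored gaps are the first half and are mirrored around the last (central) gap; for tpt, each prime has a twin 2 above it and the stored gaps separate consecutive twin pairs. The function names are hypothetical.

```python
from itertools import accumulate


def expand_symmetric(ofs: list[int]) -> list[int]:
    """Expand half a symmetric gap list (even k) into offsets from the first prime.

    Assumption: the last stored gap is the central one, so the remaining
    gaps are the earlier ones in reverse order.
    """
    gaps = list(ofs) + list(reversed(ofs[:-1]))
    return [0, *accumulate(gaps)]


def expand_twin(ofs: list[int]) -> list[int]:
    """Expand gaps between consecutive twin pairs into offsets from the first prime.

    Assumption: every pair is (p, p + 2) and each stored gap runs from the
    upper prime of one pair to the lower prime of the next.
    """
    offsets = [0, 2]
    for gap in ofs:
        first = offsets[-1] + gap
        offsets += [first, first + 2]
    return offsets


# ofs values taken from the correct output in Message 696.
print(expand_symmetric([54, 30, 10, 2, 34, 20, 22, 2]))  # spt, k=16
print(expand_symmetric([2, 16, 2, 10, 2]))               # stpt, k=10
print(expand_twin([70, 58, 10, 76, 118, 40]))            # tpt, k=14
```

With the values from Message 696 these calls give the same lists as the post for 5499120046153320487, 5499121634173665539 and 5499120954814009877, which is a quick way to double-check the hand-made expansion.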
12)
Questions and Answers :
Server any other problems :
Some data has been corrected
(Message 702)
Posted 25 days ago by Demis Post: Here, for example, is what you have (this is the part that can somehow be made sense of) Yes. But let me clarify that it and everything that follows it Correct. I assume so. It does not look like it. There is a list of lines starting with the words "tuple find:", plus a counter of all these lines, "count:". This is a completely incomprehensible number, in the sense that it is unknown where it came from. It should not be there. This is exactly what my post shows. Look carefully: k=16, or else I have misunderstood you.
I am afraid not. A large number of possible causes have been considered, but there is not a single reliable answer. The error occurs for crunchers on different computers, with different users and different hardware, but very rarely. No patterns have been found. It is only 0.02% of more than 15 million answers. I had an algorithm for finding bad answers, but it is now lost (the laptop broke back in January). Once I had found the list of "problematic" results, all that remained was to recalculate it locally, to double-check that the data from the cruncher really was incorrect. I gave you the correct one right away so that it could be seen and compared, that is, "what was received" from the cruncher and "how it should be". Take another look. Only by recalculating. (Locally or through crunchers is a detail.) Exactly! But you were not interested in this, were you? The topic was raised in my letter of August 12. Only when there is time. That does not depend on me.
Yes. I have written repeatedly that there are more important tasks, and this is one of them. I also wrote that the work continues. There are various ideas about what to do with this, but they have not yet been implemented in code. |
13)
Questions and Answers :
Server any other problems :
Some data has been corrected
(Message 696)
Posted 26 days ago by Demis Post: Example BAD data:
Read data from file 'wu_431428_803879_spt_101_5499119934525935453_1_366_output.dat' :
ident:5499119934525935453
start:5499119934525935453
chkpt:5499121884525935473
last:5499121884525936859
step (last-start):1950000001406
step (chkpt-start):1950000000020
nprime: 2338848025
status: 1
status2: 2
sieve_init_cs: 208
twin_gap_d: 886
twin_gap_6d: 400
data:
tuple find: 5499120046153320487 k=16 kind=0 (spt) deriv=0 ofs=54 30 10 2 34 20 22 2
tuple find: 5499120251551369451 k=16 kind=0 (spt) deriv=0 ofs=30 20 76 36 20 10 50 4
tuple find: 5499120773905876457 k=16 kind=0 (spt) deriv=0 ofs=2 24 18 46 24 18 30 32
tuple find: 5499120970980195589 k=16 kind=0 (spt) deriv=0 ofs=10 20 4 18 12 48 2 34
tuple find: 5499121126535252239 k=13 kind=0 (spt) deriv=0 ofs=24 6 60 42 18 12
tuple find: 5499121316841673511 k=16 kind=0 (spt) deriv=0 ofs=2 24 24 28 24 56 4 14
tuple find: 5499121483035391399 k=16 kind=0 (spt) deriv=0 ofs=40 8 22 12 8 22 48 32
tuple find: 5499121558853990699 k=16 kind=0 (spt) deriv=0 ofs=14 16 2 42 6 10 54 86
tuple find: 5499121733722103317 k=16 kind=0 (spt) deriv=0 ofs=42 10 44 34 2 58 42 2
tuple find: 5499121775895727829 k=16 kind=0 (spt) deriv=0 ofs=12 56 46 116 10 8 34 8
tuple find: 5499121234826549117 k=10 kind=1 (stpt) deriv=0 ofs=2 10 2 58 2
tuple find: 5499121440577613711 k=10 kind=1 (stpt) deriv=0 ofs=2 34 2 40 2
tuple find: 5499121475147240399 k=10 kind=1 (stpt) deriv=0 ofs=2 16 2 10 2
tuple find: 5499121591027137257 k=10 kind=1 (stpt) deriv=0 ofs=2 28 2 28 2
tuple find: 5499121666242136481 k=10 kind=1 (stpt) deriv=0 ofs=2 4 2 40 2
tuple find: 5499121680143694047 k=10 kind=1 (stpt) deriv=0 ofs=2 28 2 10 2
tuple find: 5499121740427855217 k=10 kind=1 (stpt) deriv=0 ofs=2 10 2 28 2
tuple find: 5499121817057916077 k=10 kind=1 (stpt) deriv=0 ofs=2 28 2 28 2
end data.
primes.empty() = 0
count: 18
Done.
All binary data fields are correct. Nothing has been corrupted.
But the correct data (for this task) is:
Read data from file 'output_101_5499119934525935453-manual.dat' :
ident:5499119934525935453
start:5499119934525935453
chkpt:5499121884525935473
last:5499121884525936859
step (last-start):1950000001406
step (chkpt-start):1950000000020
nprime: 2240302156
status: 1
status2: 2
sieve_init_ms: 4080 (4 sec)
twin_gap_d: 886
twin_gap_6d: 400
data:
tuple find: 5499120046153320487 k=16 kind=0 (spt) deriv=0 ofs=54 30 10 2 34 20 22 2
tuple find: 5499120251551369451 k=16 kind=0 (spt) deriv=0 ofs=30 20 76 36 20 10 50 4
tuple find: 5499120773581271527 k=16 kind=0 (spt) deriv=0 ofs=22 18 30 12 50 18 52 8
tuple find: 5499121289947186217 k=16 kind=0 (spt) deriv=0 ofs=44 6 16 44 24 6 16 8
tuple find: 5499121372440344689 k=16 kind=0 (spt) deriv=0 ofs=34 18 2 54 46 104 6 94
tuple find: 5499120954814009877 k=14 kind=2 (tpt) deriv=0 ofs=70 58 10 76 118 40
tuple find: 5499121634173665539 k=10 kind=1 (stpt) deriv=0 ofs=2 16 2 10 2
end data.
primes.empty() = 0
count: 7
Done.
We see:
tuple find: 5499120046153320487 k=16 kind=0 (spt) deriv=0 ofs=54 30 10 2 34 20 22 2
tuple find: 5499120251551369451 k=16 kind=0 (spt) deriv=0 ofs=30 20 76 36 20 10 50 4
tuple find: 5499120773905876457 k=16 kind=0 (spt) deriv=0 ofs=2 24 18 46 24 18 30 32
... ...
and
tuple find: 5499120046153320487 k=16 kind=0 (spt) deriv=0 ofs=54 30 10 2 34 20 22 2
tuple find: 5499120251551369451 k=16 kind=0 (spt) deriv=0 ofs=30 20 76 36 20 10 50 4
tuple find: 5499120773581271527 k=16 kind=0 (spt) deriv=0 ofs=22 18 30 12 50 18 52 8
... ...
After line 2 the data is incorrect. The number of entries in the list is also different. |
14)
Questions and Answers :
Server any other problems :
Some data has been corrected
(Message 694)
Posted 26 days ago by Demis Post: Yes. Quorum 2 eliminates this problem. |
15)
Questions and Answers :
Server any other problems :
Some data has been corrected
(Message 692)
Posted 26 days ago by Demis Post: There was an error in the assimilator source code published by Tomas. Because of this error, some correct “ofs” results calculated and sent by crunchers were shown with incorrect start values in the database and, accordingly, on the website. About 50,000 results were affected. Separately, there are errors in the database that arose from hardware errors on the crunchers' machines. There are not many of them: about 2-3 thousand, out of more than 15,000,000 responses from crunchers in total. Such results will be identified, disqualified, and published for re-counting. Work on this continues... It is important to understand that hardware errors on crunchers' machines are impossible to predict. They can only be found after a response has been received. |
16)
Questions and Answers :
Server any other problems :
Some data has been corrected
(Message 688)
Posted 27 days ago by Demis Post: Some tasks computed by crunchers went into an error state, and no credits were assigned for them. Yet it is absolutely certain that these tasks were calculated normally, and their answers are present in the database. For all such tasks, statistics were recalculated and credits were assigned. The following text has been added to the captions for such tasks: Validation rechecked, correct credit, calculated, fixed and assigned. v.1.0 (Example: https://boinc.termit.me/adsl/result.php?resultid=2688050)
"Batch" - "Count err":
103 - 1
111 - 19
113 - 6
115 - 99
117 - 14
119 - 211
121 - 42
123 - 18
125 - 161
129 - 114
131 - 54
133 - 16
The cause of these errors was resolved last week. |
17)
Questions and Answers :
Server any other problems :
Some data has been corrected
(Message 672)
Posted 15 Apr 2024 by Demis Post: Some data has been corrected today:
tpt k=24: 0 tuples corrected
tpt k=22: 12 tuples corrected
tpt k=20: 123 tuples corrected
tpt k=18: 2604 tuples corrected
tpt k=16: 47799 tuples corrected
tpt total: 50538 tuples corrected
The cause of the errors was in the assimilator. This has now been resolved. Data analysis work will continue. |
18)
Questions and Answers :
Automatically generated server job :
SPT task
(Message 671)
Posted 13 Apr 2024 by Demis Post: Batch 136: 9911328384525935453 begin: 9911328379391935453, end: 9911328389659935453 Count: 1 Makes an overlap of -5134000000 and +5134000000. This is a special wu created for overlap_133_135 |
19)
Questions and Answers :
Automatically generated server job :
SPT task
(Message 670)
Posted 13 Apr 2024 by Demis Post: Batch 135: 9661728384525935453 .. 9911328384525935453 -1 Count: 128000 Continue from 9,66E+18 |
20)
Message boards :
Number crunching :
Unresponsive computer
(Message 666)
Posted 3 Apr 2024 by Demis Post: Ok. Fine! All that remains is to find the right balance of resources between different projects. This can be achieved through points 1 and 2 of the FAQ: point 1 applies to a specific project (the user sets it in each project's web form), while point 2 operates in the boinc-client on a specific user's computer. |
©2024 Natalia Makarova & Alex Belyshev & Tomáš Brada