From readme:
-General
The engine do not longer return TMoves but PMoves (this increases the internal speed)
Some functions inlined
All references to int64 changed to UInt64
-Bitboards
New unit
Some new constans to help
-Board
New PlayFast(), StaticExchangeEvaluation() now uses it
MovesToSquare() changed to PMoves
Flip() implemented again to find symmetric issues
HasPiecesInCol() removed
PawnIsPassed() improved
New endings KBKKK, KRKR, KQKQ, KNKNN, KNNK
GetPieceType(side, type) split in GetXXX(side)
GetPieceCount(side, type) split in GetXXXCount(side)
Started to use bitboards
Backup is not longer a TStack<T> but a simple ping pong array
New functions AsBitboard()
Attacks[] removed, now using a precalculated bitboard per piece
Fixed strange bugs when promoting in PieceMovingToSquare()
FindPiece() now uses IsValidMove() internally which makes the selection of moves more robust
Some functions now return TScore rather than a simple integer
Finally TPiece class replaced by a simple composed integer, this increases the overall speed
New IndicesToPieceLists[] that helps to find out the index in the related piece list.
Indices/Pieces list removed
Play/AddPiece/DeletePiece/UpdateSquare updated in accordance with piece changes
-Book
A new book has been generated
Handle book code rewritten
Book for the moment does not longer uses scores
-Cache
Pawn cache now handles TScore based values
Clear() optimized
-Definitions
Removed TCaptures
Removed PDirArray
Removed Colors[]
Removed NotMine[]
Bishop and knight ending values changed
-EndGame
New ending KBKP (previously the engine was doing a bad job)
-Engine
Evaluate() now handled TScore Evaluation
-Evaluation
All scores use now TScore and only one interpolation is done at the end
Fixed another stupid overflow error in chain panws (-1 when col 1)
TrappedPieces() now handle all kind of trapped pieces
Again another silly error, don't use -f() when f() can return +-, uses always +f()...
EvalConnectedRooks() optimized a bit
Generate attacks optimized and tuned a bit (not finished)
Added piece threats and removed hungs
Added pawn adjustment to bishop eval
Bitboards used in some parts
Changed the passers bonus values
Changed the values of rook open files
Changes the tempo values
Changed the value of bishop pairs
-Killers
Removed backups
-History
Removed backups
-MoveGenerator
King was not using its position score (thanks for the non warning mr. compiler...)
Fixed a symmetric bug in the passed pawn chain
Fixed a symmetric bug in the rook eval B8 = C8
Very much improved with PMoves
New GetMoves() (generation functions do not longer return movements)
Changed the order that moves are generated
GenerateEvassions() now cleans the king moves
Pawns, knights and king fully uses bitboards (sliders next, thinking the best way wiuthout copy-pasting...)
-MovePicker
mcQuiescenseCheck removed, instread the normal mcCheck is used
Improved with PMoves
RemoveNonChecks() removed
Fixed a serious bug when in check and singular is applied
In some cases the Singular phase could return more than one move
Sort functions are executed even with only one move has to be sorted (since they can be discarded)
Class simplified, only GetMove() is public, one must call the function until move = NO_MOVE
First capturing score is not longer based on See() but on MVVLVA()
-Piece
New unit
Evaluate() now returns TScore
Created EvalXXX() so evaluation can call them directly
New class TPieceHelper with all static methods for helping unpack TPiece data
-Pieces
GetColMask() removed
New TPieces.Asbitboard()
-Scores
New unit
-Search
Added some debug defines
PV removed (I can't see improves, most probably because I'm using it wrong...)
Stats only counted in debug mode
Configuration handled better
NextStack() only called one per depth
"Singular" extension removed, silly approach...
History Reduction removed
Reorganized a bit for new picker behaviour
-Utils
Interpolate() removed, stuff goes to TScore
Tuesday, November 29, 2011
Friday, November 25, 2011
Still alive...
Just preparing ChessKISS 1.4, but it takes time...
New version looks promising:
Arena tournament
48 games played / Tournament is finished
Tournament start: 2011.11.25, 08:41:14
Latest update: 2011.11.25, 15:13:49
New version looks promising:
Arena tournament
Rank | Engine | Score | Ch | |
---|---|---|---|---|
1 | ChessKISS | 37,0/48 | · ·· ·· ·· | |
2 | Faile | 3,5/8 | 1=001010 | |
3 | BigLion | 3,0/8 | 00==1001 | |
4 | GERBIL | 2,5/8 | 0=010001 | |
5 | Micromax48_c2d | 2,0/8 | 00101000 | |
6 | ZOIDBERG | 0,0/8 | 00000000 | |
6 | Roce38 | 0,0/8 | 00000000 |
48 games played / Tournament is finished
Tournament start: 2011.11.25, 08:41:14
Latest update: 2011.11.25, 15:13:49
Thursday, October 13, 2011
Reasons why ChessKISS 64 is slower than 32, part 2
While debugging I've found this, quite interesting...
LastPieceCaptured is loaded into RAX, the pointer is valid, thus I'm going to use it, but again the compiler is loading the same value into the same register, what about the NOP operation in between?
Search.pas.898: if board.LastPieceCaptured <> nil then //same value and index 0000000000615F3D 488B8338040000 mov rax,[rbx+$00000438] 0000000000615F44 90 nop 0000000000615F45 4885C0 test rax,rax 0000000000615F48 745D jz $0000000000615fa7 Search.pas.900: recaptures[side] := board.LastPieceCaptured.GetMiddleValue + TMoveHelper.GetTo(aMove); 0000000000615F4A 488B8338040000 mov rax,[rbx+$00000438]
LastPieceCaptured is loaded into RAX, the pointer is valid, thus I'm going to use it, but again the compiler is loading the same value into the same register, what about the NOP operation in between?
Testing ChessKISS
I'm doing some changes in order to see if somehow I can improve the performance of the 64 bits version, but this of course needs to be tested, so here we go...
Arena tournament
Second after Mediocre, not bad at all, the engine seems to be behave identically.
Arena tournament
Rank | Engine | Score | Me | Ch | Bi | Mi | Ro | Pr | ZO | Pu | Ts | S-B |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | Mediocre | 14,0/16 | · · | 11 | =1 | 11 | 01 | 1= | 11 | 11 | 11 | 102,50 |
2 | ChessKISS | 11,5/16 | 00 | · · | =1 | 01 | 11 | 1= | 1= | 11 | 11 | 73,50 |
3 | BigLion | 9,0/16 | =0 | =0 | · · | 10 | =1 | 10 | 11 | =1 | 01 | 62,00 |
4 | Micromax48_c2d | 9,0/16 | 00 | 10 | 01 | · · | 10 | 11 | 01 | 11 | 01 | 60,50 |
5 | Roce38 | 6,0/16 | 10 | 00 | =0 | 01 | · · | =0 | 01 | == | 10 | 47,00 |
6 | Predateur 2.0 | 6,0/16 | 0= | 0= | 01 | 00 | =1 | · · | 0= | 10 | 10 | 44,25 |
7 | ZOIDBERG | 6,0/16 | 00 | 0= | 00 | 10 | 10 | 1= | · · | 01 | 10 | 40,25 |
8 | Pulsar2009-9b | 5,5/16 | 00 | 00 | =0 | 00 | == | 01 | 10 | · · | 11 | 32,50 |
9 | Tscp181 | 5,0/16 | 00 | 00 | 10 | 10 | 01 | 01 | 01 | 00 | · · | 36,00 |
Second after Mediocre, not bad at all, the engine seems to be behave identically.
Wednesday, October 12, 2011
Reasons why ChessKISS 64 is slower than 32
Optimized assembler in 32 bits mode:
Optimized assembler in 64 bits mode:
Both versions suffer from the same issue, loading into the same register the same value over and over (in EAX or RAX the compiler is loading the data variable), how can this still happening?
Board.pas.1234: data.EnPassant := nil; 005412BF 8B45C4 mov eax,[ebp-$3c] 005412C2 33D2 xor edx,edx 005412C4 899008020000 mov [eax+$00000208],edx Board.pas.1237: data.LastPieceCaptured := nil; 005412CA 8B45C4 mov eax,[ebp-$3c] 005412CD 33D2 xor edx,edx 005412CF 899024020000 mov [eax+$00000224],edx Board.pas.1238: data.LastPieceMoved := nil; 005412D5 8B45C4 mov eax,[ebp-$3c] 005412D8 33D2 xor edx,edx 005412DA 899020020000 mov [eax+$00000220],edx Board.pas.1239: data.LastMove := aMove; 005412E0 8B45C4 mov eax,[ebp-$3c] 005412E3 89B828020000 mov [eax+$00000228],edi Board.pas.1240: piece := nil; 005412E9 33F6 xor esi,esi
Optimized assembler in 64 bits mode:
Board.pas.1234: data.EnPassant := nil; 00000000005E423F 488B8518030000 mov rax,[rbp+$00000318] 00000000005E4246 48C7800804000000000000 mov qword ptr [rax+$00000408],$0000000000000000 Board.pas.1237: data.LastPieceCaptured := nil; 00000000005E4251 488B8518030000 mov rax,[rbp+$00000318] 00000000005E4258 48C7803004000000000000 mov qword ptr [rax+$00000430],$0000000000000000 Board.pas.1238: data.LastPieceMoved := nil; 00000000005E4263 488B8518030000 mov rax,[rbp+$00000318] 00000000005E426A 48C7802804000000000000 mov qword ptr [rax+$00000428],$0000000000000000 Board.pas.1239: data.LastMove := aMove; 00000000005E4275 488B8518030000 mov rax,[rbp+$00000318] 00000000005E427C 8B8D50030000 mov ecx,[rbp+$00000350] 00000000005E4282 898838040000 mov [rax+$00000438],ecx
Both versions suffer from the same issue, loading into the same register the same value over and over (in EAX or RAX the compiler is loading the data variable), how can this still happening?
Thursday, October 6, 2011
Creating chaining tasks
In the unit BB.Task there is a class called TTask that helps creating task, it exposes this interface:
Like this is really easy to create task based on plain methods, other tasks or closures, let's see an easy example:
We are creating task A that will execute method one, after that, method ONE will execute method TWO and after that this task will launch the closure, at the end the closure will call the Task B that is executing method THREE. Is also possible to send messages and objects among different task.
Quite handy.
ITask = interface procedure AddParam(const aName, aValue: string); procedure AttachObject(aObject: TObject); function Available: boolean; procedure ClearParams; procedure ContinueWith(aTask: ITask); overload; function ContinueWith(aProc: TProc): ITask; overload; function ContinueWith(aEvent: TTaskEvent): ITask; overload; function GetAttachedTask: ITask; function GetAttachedObject: TObject; function GetEvent: TTaskEvent; function GetProc: TProc; function GetExceptionObject: TObject; function GetExceptionEvent: TExceptionEvent; function GetLock: ILock; function GetName: string; function GetParam(const aName: string): string; function GetTerminateEvent: TNotifyEvent; function GetAttachedObjectBehaviour: TAttachedObjectBehaviour; procedure Run; procedure Stop; procedure SendMessage(aTask: ITask; aMessage: TMessage); overload; procedure SendMessage(aTask: ITask; const aText: string); overload; procedure SetLock(aLock: ILock); procedure SetAttachedObjectBehaviour(aValue: TAttachedObjectBehaviour); procedure ReceiveMessage(aMessage: TMessage); procedure SetPriority(aValue: TThreadPriority); function Terminated: boolean; function Wait(aTime: cardinal): boolean; end;
Like this is really easy to create task based on plain methods, other tasks or closures, let's see an easy example:
unit Unit52; interface uses Winapi.Windows, Winapi.Messages, System.SysUtils, System.Variants, System.Classes, Vcl.Graphics, Vcl.Controls, Vcl.Forms, Vcl.Dialogs, BB.Task; type TForm51 = class(TForm) procedure FormCreate(Sender: TObject); private procedure One; procedure Two; procedure Three; { Private declarations } public { Public declarations } end; var Form51: TForm51; implementation {$R *.dfm} procedure TForm51.FormCreate(Sender: TObject); var TaskA, TaskB: ITask; begin TaskB := TTask.Create(Three); TaskA := TTask.Create(One); TaskA.ContinueWith(Two).ContinueWith( procedure begin Beep; end ).ContinueWith(TaskB); TaskA.Run; end; procedure TForm51.One; begin Sleep(1000); end; procedure TForm51.Three; begin Sleep(3000); end; procedure TForm51.Two; begin Sleep(2000); end; end.
We are creating task A that will execute method one, after that, method ONE will execute method TWO and after that this task will launch the closure, at the end the closure will call the Task B that is executing method THREE. Is also possible to send messages and objects among different task.
Quite handy.
Monday, October 3, 2011
New class TCountdownEventEx for versions prior to XE2
I've discovered some new handy classes in XE2 and I said to myself why not implement them into prior versions via my own class?
TCountdownEventEx = class private FCurrent, FInitial: integer; FEvent: TEventEx; public constructor Create(aSize: integer); destructor Destroy; override; procedure Add(aSize: integer); function IsFinish: boolean; procedure Reset; procedure Signal; procedure Wait(aTimeout: cardinal); property Current: integer read FCurrent; property Initial: integer read FInitial; end; { TCountdownEventEx } procedure TCountdownEventEx.Add(aSize: integer); begin if IsFinish then raise Exception.Create('Already signalled'); TInterlockedEx.Add(FCurrent, aSize); end; constructor TCountdownEventEx.Create(aSize: integer); begin FInitial := aSize; FEvent := TEventEx.Create; Reset; end; destructor TCountdownEventEx.Destroy; begin FEvent.Free; inherited; end; procedure TCountdownEventEx.Reset; begin FEvent.Reset; FCurrent := FInitial; end; procedure TCountdownEventEx.Signal; begin TInterlockedEx.Dec(FCurrent); if FCurrent = 0 then FEvent.Sign; end; function TCountdownEventEx.IsFinish: boolean; begin Exit(FCurrent <= 0); end; procedure TCountdownEventEx.Wait(aTimeout: cardinal); begin FEvent.Wait(aTimeout); end;I've used the same example as in Delphi help and it seems to work...
program TCountdownEvent_example; {$APPTYPE CONSOLE} uses SysUtils, BB.Sync, Classes; var RunwayFlag: TCountdownEventEx; type TThreadIgnite = class(TThread) private procedure Execute; override; end; procedure TThreadIgnite.Execute; begin RunwayFlag.Wait(INFINITE); Writeln('Ignited'); end; var Thread: TThreadIgnite; begin RunwayFlag := TCountdownEventEx.Create(200); Thread := TThreadIgnite.Create(True); Thread.Start; while not RunwayFlag.IsFinish do begin RunwayFlag.Signal; Writeln(RunwayFlag.Current); end; end. { Put breakpoint here to see the console output. }
TInterlocked
Delphi XE2? has out of the box the TInterlocked helper functions, but prior to that there is only the Windows API, what's why I have my own version TInterlockedEx:
It works for 32 bits (with optimized assembler) and also for 64 bits (not so optimized yet...)
{ TInterlockedEx } class function TInterlockedEx.CAS(var aTarget: integer; aCurrentVal, aNewVal: integer): boolean; begin { //Compare, if equal then set and return TRUE if aTarget = aCurrentVal then begin aTarget := aNewVal; result := true; else result := false; } result := InterlockedCompareExchange(aTarget, aNewVal, aCurrentVal) = aCurrentVal; end; class procedure TInterlockedEx.Add(var aValue: integer; aCounter: integer); //EDX,ECX {$IFDEF WIN32} ASM LOCK XADD [EDX], ECX MOV EAX, [EDX] END; {$ELSE} begin _Lock.Lock; try System.Inc(aValue, aCounter); finally _Lock.Unlock; end; end; {$ENDIF} {$IFDEF WIN32} class function TInterlockedEx.CAS(var aTarget: pointer; aCurrentVal, aNewVal: pointer): boolean; begin result := CAS(integer(aTarget), integer(aCurrentVal), integer(aNewVal)); end; class function TInterlockedEx.CAS(var aTarget: TObject; aCurrentVal, aNewVal: TObject): boolean; begin result := CAS(integer(aTarget), integer(aCurrentVal), integer(aNewVal)); end; {$ENDIF} class function TInterlockedEx.Dec(var aValue: integer): integer; begin result := InterlockedDecrement(aValue); end; class function TInterlockedEx.Inc(var aValue: int64): integer; begin if _Lock <> nil then //This is also call by _Lock.Create(), that's why we check first _Lock.Lock; try System.Inc(aValue); result := aValue; finally if _Lock <> nil then _Lock.Unlock; end; end; class function TInterlockedEx.Exchange(var aValA, aValB: integer): integer; begin result := InterlockedExchange(aValA, aValB); end; class procedure TInterlockedEx.Sub(var aValue: integer; aCounter: integer); //EDX,ECX {$IFDEF WIN32} ASM NEG ECX //1-3 = 1+(-3) LOCK XADD [EDX], ECX MOV EAX, [EDX] END; {$ELSE} begin _Lock.Lock; try System.Dec(aValue, aCounter); finally _Lock.Unlock; end; end; {$ENDIF} class function TInterlockedEx.Inc(var aValue: integer): integer; begin result := InterlockedIncrement(aValue); end; class function TInterlockedEx.CAS(var aTarget: LongBool; aCurrentVal, aNewVal: LongBool): boolean; begin result := CAS(integer(aTarget), integer(aCurrentVal), integer(aNewVal)); end; class function TInterlockedEx.CAS(var aTarget: cardinal; aCurrentVal, aNewVal: cardinal): boolean; begin result := CAS(integer(aTarget), integer(aCurrentVal), integer(aNewVal)); end; class function TInterlockedEx.Dec(var aValue: int64): integer; begin _Lock.Lock; try System.Dec(aValue); result := aValue; finally _Lock.Unlock; end; end;
It works for 32 bits (with optimized assembler) and also for 64 bits (not so optimized yet...)
Saturday, October 1, 2011
Monday, September 19, 2011
XE2, strange optimizations when in 64 bits...
Given this silly code:
The optimized 32 bits code looks like:
Quite ok, direct access to the Self pointer located in EAX and decreasing loop.
and now the 64 bits version:
What is this?, why for every single self access a pointer in rdx is loaded?, is that an issue in the x64 assembler code that I'm not aware?, one cannot do add dword prt [rcx+$000065c], 4?, and what about the increasing plus comparing loop?
Dissapointing?, well, if you are able to modify your code to access self variables as less as possible then the optimization works better, let's see with this slightly modification:
and now the assembler code shows:
The code looks much better now although the loop still funny (add + cmp). The thing is that in order to speed up ChessKISS in the 64 bits version, I've to remove as much as possible the use of class fields, which is not an easy task, let's see if I got the energy to do so and it yields the expected results.
type TForm50 = class(TForm) procedure FormCreate(Sender: TObject); private FA, FB, FC, FD: integer; { Private declarations } public { Public declarations } end; var Form50: TForm50; implementation {$R *.dfm} procedure TForm50.FormCreate(Sender: TObject); var i: integer; begin FA := 0; FB := 0; FC := 0; FD := 0; for i := 0 to 999999 do begin Inc(FA); Inc(FB, 2); Inc(FC, 3); Inc(FD, 4); end; end;
The optimized 32 bits code looks like:
Unit50.pas.31: FA := 0; 0050FE5C 33D2 xor edx,edx 0050FE5E 899090030000 mov [eax+$00000390],edx Unit50.pas.32: FB := 0; 0050FE64 33D2 xor edx,edx 0050FE66 899094030000 mov [eax+$00000394],edx Unit50.pas.33: FC := 0; 0050FE6C 33D2 xor edx,edx 0050FE6E 899098030000 mov [eax+$00000398],edx Unit50.pas.34: FD := 0; 0050FE74 33D2 xor edx,edx 0050FE76 89909C030000 mov [eax+$0000039c],edx Unit50.pas.36: for i := 0 to 999999 do 0050FE7C BA40420F00 mov edx,$000f4240 Unit50.pas.38: Inc(FA); 0050FE81 FF8090030000 inc dword ptr [eax+$00000390] Unit50.pas.39: Inc(FB, 2); 0050FE87 83809403000002 add dword ptr [eax+$00000394],$02 Unit50.pas.40: Inc(FC, 3); 0050FE8E 83809803000003 add dword ptr [eax+$00000398],$03 Unit50.pas.41: Inc(FD, 4); 0050FE95 83809C03000004 add dword ptr [eax+$0000039c],$04 Unit50.pas.36: for i := 0 to 999999 do 0050FE9C 4A dec edx 0050FE9D 75E2 jnz $0050fe81
Quite ok, direct access to the Self pointer located in EAX and decreasing loop.
and now the 64 bits version:
Unit50.pas.31: FA := 0; 000000000059C520 C7815006000000000000 mov [rcx+$00000650],$00000000 Unit50.pas.32: FB := 0; 000000000059C52A C7815406000000000000 mov [rcx+$00000654],$00000000 Unit50.pas.33: FC := 0; 000000000059C534 C7815806000000000000 mov [rcx+$00000658],$00000000 Unit50.pas.34: FD := 0; 000000000059C53E C7815C06000000000000 mov [rcx+$0000065c],$00000000 Unit50.pas.36: for i := 0 to 999999 do 000000000059C548 4833C0 xor rax,rax Unit50.pas.38: Inc(FA); 000000000059C54B 488D9150060000 lea rdx,[rcx+$00000650] 000000000059C552 830201 add dword ptr [rdx],$01 Unit50.pas.39: Inc(FB, 2); 000000000059C555 488D9154060000 lea rdx,[rcx+$00000654] 000000000059C55C 830202 add dword ptr [rdx],$02 Unit50.pas.40: Inc(FC, 3); 000000000059C55F 488D9158060000 lea rdx,[rcx+$00000658] 000000000059C566 830203 add dword ptr [rdx],$03 Unit50.pas.41: Inc(FD, 4); 000000000059C569 488D915C060000 lea rdx,[rcx+$0000065c] 000000000059C570 830204 add dword ptr [rdx],$04 Unit50.pas.42: end; 000000000059C573 83C001 add eax,$01 000000000059C576 81F840420F00 cmp eax,$000f4240 000000000059C57C 75CD jnz $000000000059c54b
What is this?, why for every single self access a pointer in rdx is loaded?, is that an issue in the x64 assembler code that I'm not aware?, one cannot do add dword prt [rcx+$000065c], 4?, and what about the increasing plus comparing loop?
Dissapointing?, well, if you are able to modify your code to access self variables as less as possible then the optimization works better, let's see with this slightly modification:
procedure TForm50.FormCreate(Sender: TObject); var a, b, c, d, i: integer; begin a := 0; b := 0; c := 0; d := 0; for i := 0 to 999999 do begin Inc(a); Inc(b, 2); Inc(c, 3); Inc(d, 4); end; FA := a; FB := b; FC := c; FD := d; end;
and now the assembler code shows:
Unit50.pas.32: a := 0; 000000000059C520 4833C0 xor rax,rax Unit50.pas.33: b := 0; 000000000059C523 4833D2 xor rdx,rdx Unit50.pas.34: c := 0; 000000000059C526 4D33C0 xor r8,r8 Unit50.pas.35: d := 0; 000000000059C529 4D33C9 xor r9,r9 Unit50.pas.37: for i := 0 to 999999 do 000000000059C52C 4D33D2 xor r10,r10 Unit50.pas.39: Inc(a); 000000000059C52F 83C001 add eax,$01 Unit50.pas.40: Inc(b, 2); 000000000059C532 83C202 add edx,$02 Unit50.pas.41: Inc(c, 3); 000000000059C535 4183C003 add r8d,$03 Unit50.pas.42: Inc(d, 4); 000000000059C539 4183C104 add r9d,$04 Unit50.pas.43: end; 000000000059C53D 4183C201 add r10d,$01 000000000059C541 4181FA40420F00 cmp r10d,$000f4240 000000000059C548 75E5 jnz $000000000059c52f 000000000059C54A 90 nop Unit50.pas.45: FA := a; 000000000059C54B 898150060000 mov [rcx+$00000650],eax Unit50.pas.46: FB := b; 000000000059C551 899154060000 mov [rcx+$00000654],edx Unit50.pas.47: FC := c; 000000000059C557 44898158060000 mov [rcx+$00000658],r8d Unit50.pas.48: FD := d; 000000000059C55E 4489895C060000 mov [rcx+$0000065c],r9d
The code looks much better now although the loop still funny (add + cmp). The thing is that in order to speed up ChessKISS in the 64 bits version, I've to remove as much as possible the use of class fields, which is not an easy task, let's see if I got the energy to do so and it yields the expected results.
New code formatter
I know I'm a bit lazy when we talk about polishing my blog entries, I will try to improve the quality of the HTML visualization of the code submitted, as an example I'm posting the full 2d collision unit, which BTW let's you test the collision of two bitmaps taking into account zoom and angle.
Thanks to SyntaxHighlighter http://alexgorbatchev.com/SyntaxHighlighter/manual/themes/
What do you think mr. stranger?
unit BB.Screen.Collisions; interface uses Windows, Classes, BB.Screen.Surfaces, BB.Screen.Interfaces, BB.E3D.Interfaces, BB.Types, BB.Screen.Types, BB.E3D.Polygons; type TBoundingBox = (bbFull, bbManual, bbAuto); TCollision = class private FCollisionShape: TCollisionShape; FCollider: ICollider; FBB, FSiluoette, FShape, FCurrentShape: IShape; FAccuracy: TCollisionAccuracy; FBoundingBox: TBoundingBox; function GetEdges: IShape; procedure CheckCollider; function GetAccuracy(aTotal: integer): integer; function GetRect(aShape: IShape): TRect; function GetShape(aShape: TCollisionShape): IShape; function GetBB: IShape; procedure SetCollisionShape(const aValue: TCollisionShape); function TestBB(aCollision: TCollision; out aRectA, aRectB, aOverlap: TRect): boolean; function TestColorKey(const aRectA, aRectB, aOverlap: TRect; aCollision: TCollision; out aX, aY: integer): boolean; procedure SetCurrentShapeAsBB; procedure SetCurrentShapeAsCustom; public constructor Create(aCollider: ICollider); destructor Destroy; override; function Test(aCollision: TCollision; out aX, aY: integer): boolean; procedure Reset; function GetCurrentShape: IShape; property Collider: ICollider read FCollider; property CollisionShape: TCollisionShape read FCollisionShape write SetCollisionShape; property Accuracy: TCollisionAccuracy read FAccuracy write FAccuracy; property BoundingBox: TBoundingBox read FBoundingBox write FBoundingBox; end; implementation uses SysUtils, {Types, }Generics.Collections, BB.Math.Matrix, BB.Math.Vector, BB.E3D.Vertex, BB.Math, BB.Interfaces, BB.Screen, BB.Colors, BB.Utils; { TCollision } constructor TCollision.Create(aCollider: ICollider); begin FCurrentShape := nil; FShape := nil; FBB := nil; FSiluoette := nil; FCollider := aCollider; FAccuracy := caLow; FBoundingBox := bbFull; CollisionShape := csRectangle; end; destructor TCollision.Destroy; begin FSiluoette := nil; FBB := nil; FShape := nil; inherited; end; function TCollision.GetEdges: IShape; begin Exit(IShape(FCollider.GetSurface.GetShape)); end; function TCollision.GetShape(aShape: TCollisionShape): IShape; var nx, ny, ScaledWidth, ScaledHeight, cx, cy, angle, x, y, z, zoom: TFloat; matrix: TMatrix; vertex, vector: TVector; accuracy, i, width, height: integer; surface: ISurface; //bb: TRect; begin FCollider.GetInfo(x, y, z, zoom, angle); surface := FCollider.GetSurface; //width := (bb.Right - bb.Left); //height := (bb.Bottom - bb.Top); width := surface.GetWidth; height := surface.GetHeight; nx := zoom / 100; ny := zoom / 100; case FBoundingBox of bbFull: begin ScaledWidth := width * nx / 2; ScaledHeight := height * ny / 2; end; bbManual, bbAuto: begin ScaledWidth := 0; ScaledHeight := 0; { TODO : bbAuto 0,0,31,31 } end; end; //cx := bb.Left + ScaledWidth; //cy := bb.Top + ScaledHeight; cx := x + ScaledWidth; cy := y + ScaledHeight; matrix := TMatrix.Create(0, 0, -angle); case aShape of csPrecise: begin if FSiluoette = nil then begin FSiluoette := GetEdges; if FSiluoette = nil then Exit(GetShape(csRectangle)); FShape := TPolygon.Create(FSiluoette.GetSides); end; for i := 0 to FSiluoette.GetSides - 1 do begin vector := FSiluoette.GetVertex(i).ClippedVertex; vector.X := vector.X - (Width / 2);//ScaledWidth; vector.Y := vector.Y - (Height / 2);//ScaledHeight; vertex.Rotate(vector, matrix); FShape.GetVertex(i).ClippedVertex := TVector.Create(cx + (vertex.X * nx), cy + (vertex.Y * ny), z); end; end; csRhombus: begin // 1 //4 2 // 3 //1 vertex.Rotate(TVector.Create(0, -ScaledHeight, 0), matrix); FShape.GetVertex(0).ClippedVertex := TVector.Create(cx + vertex.X, cy + vertex.Y, z); //2 vertex.Rotate(TVector.Create(ScaledWidth, 0, 0), matrix); FShape.GetVertex(1).ClippedVertex := TVector.Create(cx + vertex.X, cy + vertex.Y, z); //3 vertex.Rotate(TVector.Create(0, ScaledHeight, 0), matrix); FShape.GetVertex(2).ClippedVertex := TVector.Create(cx + vertex.X, cy + vertex.Y, z); //4 vertex.Rotate(TVector.Create(-ScaledWidth, 0, 0), matrix); FShape.GetVertex(3).ClippedVertex := TVector.Create(cx + vertex.X, cy + vertex.Y, z); end; csRectangle: begin FShape := FBB; end; csCircle: begin accuracy := 360 div GetAccuracy(360); for i := 0 to FShape.GetSides - 1 do begin vertex.Rotate(TVector.Create(ScaledWidth * Cos(i * accuracy * RAD), ScaledHeight * Sin(i * accuracy * RAD), 0), matrix); FShape.GetVertex(i).ClippedVertex := TVector.Create(cx + vertex.X, cy + vertex.Y, z); end; end; csTriangle: begin // 1 // //3 2 //1 vertex.Rotate(TVector.Create(0, -ScaledHeight, 0), matrix); FShape.GetVertex(0).ClippedVertex := TVector.Create(cx + vertex.X, cy + vertex.Y, z); //2 vertex.Rotate(TVector.Create(ScaledWidth, ScaledHeight, 0), matrix); FShape.GetVertex(1).ClippedVertex := TVector.Create(cx + vertex.X, cy + vertex.Y, z); //3 vertex.Rotate(TVector.Create(-ScaledWidth, ScaledHeight, 0), matrix); FShape.GetVertex(2).ClippedVertex := TVector.Create(cx + vertex.X, cy + vertex.Y, z); end; csColorKey: begin end; end; result := FShape; end; procedure TCollision.SetCollisionShape(const aValue: TCollisionShape); begin FCollisionShape := aValue; if FShape <> nil then begin FBB := nil; FShape := nil; Reset; end; //Always use Bounding Boxes FBB := TPolygon.Create(4); FCurrentShape := FBB; case FCollisionShape of csRectangle: begin FShape := FBB; end; csRhombus: begin FShape := TPolygon.Create(4); end; csCircle: begin FShape := TPolygon.Create(GetAccuracy(360)); end; csTriangle: begin FShape := TPolygon.Create(3); end; end; end; function TCollision.Test(aCollision: TCollision; out aX, aY: integer): boolean; var sa, sb: IShape; RectA, RectB, overlap: TRect; begin CheckCollider; aCollision.CheckCollider; aX := -1; aY := -1; if (CollisionShape = csNone) or (aCollision.CollisionShape = csNone) or (aCollision = self) then Exit(False); //Fast bounding boxes SetCurrentShapeAsBB; aCollision.SetCurrentShapeAsBB; if not TestBB(aCollision, RectA, RectB, overlap) then Exit(False); if (CollisionShape = csColorKey) and (aCollision.CollisionShape = csColorKey) then Exit(TestColorKey(RectA, RectB, overlap, aCollision, aX, aY)); //Custom collision test SetCurrentShapeAsCustom; aCollision.SetCurrentShapeAsCustom; sa := GetShape(FCollisionShape); sb := aCollision.GetShape(aCollision.FCollisionShape); if (sa = nil) or (sb = nil) then Exit(False); result := sa.Collision(sb, aX, aY) > -1; end; function TCollision.TestBB(aCollision: TCollision; out aRectA, aRectB, aOverlap: TRect): boolean; begin aRectA := GetRect(GetBB); aRectB := GetRect(aCollision.GetBB); result := IntersectRect(aOverlap, aRectA, aRectB); end; function TCollision.TestColorKey(const aRectA, aRectB, aOverlap: TRect; aCollision: TCollision; out aX, aY: integer): boolean; { TODO : -bpp -optimize } var xb, yb, xa, ya, i, j, width, height: integer; BufferB, BufferA: pointer; ColorKeyB, ColorKeyA: cardinal; SurfaceB, SurfaceA: ISurface; begin result := False; width := aOverlap.Right - aOverlap.Left - 1; height := aOverlap.Bottom - aOverlap.Top - 1; //Self if aRectA.Left < aOverlap.Left then xa := aOverlap.Left - aRectA.Left else xa := 0; if aRectA.Top < aOverlap.Top then ya := aOverlap.Top - aRectA.Top else ya := 0; //Other if aRectB.Left < aOverlap.Left then xb := aOverlap.Left - aRectB.Left else xb := 0; if aRectB.Top < aOverlap.Top then yb := aOverlap.Top - aRectB.Top else yb := 0; SurfaceB := aCollision.FCollider.GetSurface; SurfaceA := FCollider.GetSurface; SurfaceA.Lock([lfRead], nil); SurfaceB.Lock([lfRead], nil); try ColorKeyA := SurfaceA.GetColorKey; ColorKeyB := SurfaceB.GetColorKey; j := height; while j >= 0 do begin i := width; while i >= 0 do begin BufferA := SurfaceA.GetAddress(xa + i, ya + j); BufferB := SurfaceB.GetAddress(xb + i, yb + j); if (PCardinal(BufferA)^ <> ColorKeyA) and (PCardinal(BufferB)^ <> ColorKeyB) then begin aX := aOverlap.Left + i; aY := aOverlap.Top + j; Exit(True); end; Dec(i); end; Dec(j); end; finally SurfaceB.UnLock(nil); SurfaceA.Unlock(nil); end; end; function TCollision.GetAccuracy(aTotal: integer): integer; begin case FAccuracy of caLow: begin result := Round(aTotal * 0.15); end; caHigh: begin result := Round(aTotal * 0.65); end; caMax: begin result := aTotal; end else result := Round(aTotal * 0.40); end; end; function TCollision.GetBB: IShape; var nx, ny, ScaledWidth, ScaledHeight, cx, cy, angle, x, y, z, zoom: TFloat; matrix: TMatrix; vector: TVector; surface: ISurface; //bb: TRect; width, height: integer; begin FCollider.GetInfo(x, y, z, zoom, angle); surface := FCollider.GetSurface; //width := (bb.Right - bb.Left); //height := (bb.Bottom - bb.Top); width := surface.GetWidth; height := surface.GetHeight; nx := zoom / 100; ny := zoom / 100; ScaledWidth := width * nx / 2; ScaledHeight := height * ny / 2; //cx := bb.Left + ScaledWidth; //cy := bb.Top + ScaledHeight; cx := x + ScaledWidth; cy := y + ScaledHeight; matrix := TMatrix.Create(0, 0, -angle); //1 2 // //4 3 //1 vector.Rotate(TVector.Create(-ScaledWidth, -ScaledHeight, 0), matrix); FBB.GetVertex(0).ClippedVertex := TVector.Create(cx + vector.X, cy + vector.Y, z); //2 vector.Rotate(TVector.Create(ScaledWidth, -ScaledHeight, 0), matrix); FBB.GetVertex(1).ClippedVertex := TVector.Create(cx + vector.X, cy + vector.Y, z); //3 vector.Rotate(TVector.Create(ScaledWidth, ScaledHeight, 0), matrix); FBB.GetVertex(2).ClippedVertex := TVector.Create(cx + vector.X, cy + vector.Y, z); //4 vector.Rotate(TVector.Create(-ScaledWidth, ScaledHeight, 0), matrix); FBB.GetVertex(3).ClippedVertex := TVector.Create(cx + vector.X, cy + vector.Y, z); result := FBB; end; function TCollision.GetCurrentShape: IShape; begin Exit(FCurrentShape); end; procedure TCollision.SetCurrentShapeAsCustom; begin FCurrentShape := FShape; end; procedure TCollision.SetCurrentShapeAsBB; begin //Fast Bounding Boxes FCurrentShape := FBB; end; function TCollision.GetRect(aShape: IShape): TRect; var v: TVector; i: Integer; begin v := aShape.GetVertex(0).ClippedVertex; result.Left := Trunc(v.X); result.Right := result.Left; result.Top := Trunc(v.Y); result.Bottom := result.Top; for i := 1 to aShape.GetSides - 1 do begin v := aShape.GetVertex(i).ClippedVertex; if v.X < result.Left then result.Left := Trunc(v.X) else if v.X > result.Right then result.Right := Trunc(v.X); if v.Y < result.Top then result.Top := Trunc(v.Y) else if v.Y > result.Bottom then result.Bottom := Trunc(v.Y); end; end; procedure TCollision.Reset; begin FSiluoette := nil; end; procedure TCollision.CheckCollider; begin if FCollider = nil then raise Exception.Create('Collider not assigned'); end; end.
Thanks to SyntaxHighlighter http://alexgorbatchev.com/SyntaxHighlighter/manual/themes/
What do you think mr. stranger?
Sunday, September 18, 2011
x64 release
I've update the following links:
- ChessKISS binaries and sources to 1.3a
- BB Library to 1.2
- Demos
Now they all compile in the 64 bits version of Delphi, everything seems to work except the DirectX stuff which I guess has a problem with the headers (not yet resolved), also some code is being fixed in order to compile under MACOS.
BTW, the 64 bits version of ChessKISS is not faster than the 32 bits even if lot's of stuff is using int64 (see a previous post), well is even slower...
Thursday, September 15, 2011
ChessKISS 64 bits finally works as expected!
I've finally found the error that provokes a different behaviour between the 32 and 64 bits version of ChessKISS, see this highlighted code in EvalRook():
That is new and also was having an unknown behaviour in 32 bits in the trapped rook scenario:
//Check to see if the king has been forced to move and has trapped a rook at a1/b1/g1/h1, //if so, then penalize the trapped rook to help extricate it if FStage <= sMiddle then begin x := AllowedCols[col]; if (x > 0) and (aPiece.GetRank = 1) then begin //blocked up? if (FBoard.GetPiece(aPiece.Forward) <> nil) and //Blocked left or right (FBoard.GetPieceType(aPiece.Index + Deltas[x]) = Kings[me]) then Dec(pos, TRAPPED_ROOK_PENALTY); end;end;
That is new and also was having an unknown behaviour in 32 bits in the trapped rook scenario:
- Stage before ending
- Columns 1 or 2 or 7 or 8
- Next square occupied
- Left or right squares, occupied by own king
But this should only be checked in rank 1!!!, in the other ranks it does not make sense, but even worst on the last rank you can only get garbage since the piece is out of the board...
Well, I will upload a fix for this and also the brand new 64 bits version
Saturday, September 10, 2011
Orient, the demo that was never released
In case you want to know what was inside my brain 14 years ago...
(thankfully never released :-)
Old tools (really old)
I've added some old tools used for the old demos (compression, colors, textures, 3d files, etc)
Find it in the download section
Find it in the download section
Another old release
Every time I check the Legacy/ folder I found funny stuff, this time an intro for Micro Palma (an ancient BBS...)
It is strange to look at your old code, it can be very much crappy (being very polite...)
Anyway, I've added to the download section
It is strange to look at your old code, it can be very much crappy (being very polite...)
Anyway, I've added to the download section
Borland pascal 5.1, old libraries
I've realized that although I've released the source of the old demos, I haven't release the source of the libraries, thus, you won't be able to compile the projects, ok, I've fix this issue, now there is a new link under the label "Borland Pascal 5.1 library"
Enjoy?
Enjoy?
Friday, September 9, 2011
Trying Delphi XE2, testing Mac OSX
I'm trying the Mac stuff with no success, I've followed all instructions:
- Windows 7 IP is 192.168.2.2
- Install VirtualBox
- Install Mac OSX via VirtualBox
- Install in Mac the paserver thing from Embarcadero
- Launch pasever at port 64211
- VirtualBox guest IP is 192.168.2.69 forwared to 192.168.2.2 port 50000
Ping to remote OS works, but telnet on port 50000 doesn't (neither on port 64211), therefore in Delphi it does not work..., it shows: [PAClient Error] Error: EIdSocketError: Socket Error # 10061
The configuration in VirtualBox:
And the configuration in Delphi:
With Windows Firewall disabled it also does not work, so I'm really lost, somebody has been success with this kind of configuration?
Monday, September 5, 2011
Trying Delphi XE2, some performance tests, part 2
I had the feeling that the compiler does some strange optimizations somewhere since ChessKISS is slower in the 64 bits version, in order to prove that I've created the following test:
type TForm44 = class(TForm) procedure FormCreate(Sender: TObject); private FA, FB, FC, FD: NativeInt; { Private declarations } public { Public declarations } end; var Form44: TForm44; implementation {$R *.dfm} procedure TForm44.FormCreate(Sender: TObject); var a, b, c, d: NativeInt; begin a := 1; FA := a * 2; b := FA * 4; FB := b * 8; c := FB * 16; FC := c * 32; d := FC * 64; FD := d * 128; Caption := IntToStr(FD); end; end.
The optimized Win32 version:
Unit44.pas.31: a := 1; 0050FE73 B801000000 mov eax,$00000001 Unit44.pas.32: fa := a * 2; 0050FE78 8BD0 mov edx,eax 0050FE7A 03D2 add edx,edx 0050FE7C 899390030000 mov [ebx+$00000390],edx Unit44.pas.33: b := fa * 4; 0050FE82 8BC2 mov eax,edx 0050FE84 03C0 add eax,eax 0050FE86 03C0 add eax,eax Unit44.pas.34: fb := b * 8; 0050FE88 8BD0 mov edx,eax 0050FE8A 03D2 add edx,edx 0050FE8C 03D2 add edx,edx 0050FE8E 03D2 add edx,edx 0050FE90 899394030000 mov [ebx+$00000394],edx Unit44.pas.35: c := fb * 16; 0050FE96 8BC2 mov eax,edx 0050FE98 C1E004 shl eax,$04 Unit44.pas.36: fc := c * 32; 0050FE9B 8BD0 mov edx,eax 0050FE9D C1E205 shl edx,$05 0050FEA0 899398030000 mov [ebx+$00000398],edx Unit44.pas.37: d := fc * 64; 0050FEA6 8BC2 mov eax,edx 0050FEA8 C1E006 shl eax,$06 Unit44.pas.38: fd := d * 128; 0050FEAB 8BF0 mov esi,eax 0050FEAD C1E607 shl esi,$07 0050FEB0 89B39C030000 mov [ebx+$0000039c],esiA more or less good optimization, let's see the win64 version:Unit44.pas.31: a := 1; 000000000059C535 C7C001000000 mov eax,$00000001 Unit44.pas.32: FA := a * 2; 000000000059C53B 488D0400 lea rax,[rax+rax] 000000000059C53F 488B4D20 mov rcx,[rbp+$20] 000000000059C543 48898150060000 mov [rcx+$00000650],rax Unit44.pas.33: b := FA * 4; 000000000059C54A 488B8150060000 mov rax,[rcx+$00000650] 000000000059C551 4803C0 add rax,rax 000000000059C554 4803C0 add rax,rax Unit44.pas.34: FB := b * 8; 000000000059C557 488D04C500000000 lea rax,[rax*8+$0000] 000000000059C55F 488B4D20 mov rcx,[rbp+$20] 000000000059C563 48898158060000 mov [rcx+$00000658],rax Unit44.pas.35: c := FB * 16; 000000000059C56A 488B8158060000 mov rax,[rcx+$00000658] 000000000059C571 48C1E004 shl rax,$04 Unit44.pas.36: FC := c * 32; 000000000059C575 48C1E005 shl rax,$05 000000000059C579 488B4D20 mov rcx,[rbp+$20] 000000000059C57D 48898160060000 mov [rcx+$00000660],rax Unit44.pas.37: d := FC * 64; 000000000059C584 488B8160060000 mov rax,[rcx+$00000660] 000000000059C58B 48C1E006 shl rax,$06 Unit44.pas.38: FD := d * 128; 000000000059C58F 48C1E007 shl rax,$07 000000000059C593 488B4D20 mov rcx,[rbp+$20] 000000000059C597 48898168060000 mov [rcx+$00000668],raxAt a first glance it seems ok, but a closer look (check in blue) shows some strange results, Self accesses are not well optimized (I've noticed that also in ChessKISS), it keeps saving/reading the current value in rax register, looks like it could be easily kept on some register (and you have many in x64)
Labels:
Delphi,
Optimizations,
XE2
Location:
Palma, Spain
Saturday, September 3, 2011
Trying Delphi XE2, some performance tests....
Given the following program I've measure the speed with combinations of WIN32/64 and Debug/Release:
TEST 1)
TEST 1)
var i: integer; x, y: int64; begin x := 0; y := 0; for i := 0 to MaxLongInt -1 do begin Inc(x, y); Inc(y); end; end;
And the results expressed in milliseconds:
Debug | Release | |
Win32 | 6.703 | 5.395 |
Win64 | 4.005 | 1.380 |
Clearly the Win64 + Release is almost four times faster than the optimized 32 bits version, there is not big difference between non optimized versions of Win32 and Win64.
TEST 2)
Another test, this time implies multiplications:
var
i: integer;
x, y: int64;
begin
x := 0;
y := 0;
for i := 0 to MaxLongInt -1 do
begin
x := x * y;
Inc(y, i);
end;
end;
And again the results:
Debug | Release | |
Win32 | 14368 | 14461 |
Win64 | 5718 | 2398 |
The optimized Win64 version is seven times faster than optimized Win32, there is no difference between optimized and non optimized versions of Win32, the conclussion is that Delphi does not know how to optimize x64 assembler when in 32 bits mode.
But it seems that they are doing a good job with the optimizations of the 64 bits, let's see if I'm able to compare with other compilers.
Trying Delphi XE2, ChessKISS 64 bits vs 32 bits
Well, it was not difficult to make it work under 64 bits, but a bit to tune the WIN32 assembler versions that are not longer valid. I had to change the following functions:
function TBit.FirstBitSet(aValue: integer): integer; {$IFDEF WIN32} //EDX ASM BSF EAX, EDX END; {$ELSE} const Mod37BitPosition: array[0..36] of integer = ( 32, 0, 1, 26, 2, 23, 27, 0, 3, 16, 24, 30, 28, 11, 0, 13, 4, 7, 17, 0, 25, 22, 31, 15, 29, 10, 12, 6, 0, 21, 14, 9, 5, 20, 8, 19, 18 ); begin { result := 0; for i := 0 to 31 do begin if _GetBit(aValue, i) then Exit(i); end; } result := Mod37BitPosition[(-aValue and aValue) mod 37]; end; {$ENDIF}
function TBit.LastBitSet(aValue: integer): integer; {$IFDEF WIN32} //EDX ASM BSR EAX, EDX END; {$ELSE} const MultiplyDeBruijnBitPosition: array[0..31] of integer = (0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8, 31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9); begin { result := 0; for i := 31 downto 0 do begin if _GetBit(aValue, i) then Exit(i); end; } result := MultiplyDeBruijnBitPosition[cardinal(aValue * $077cb531) shr 27]; end; {$ENDIF}
function TBit.PopFirstBitSet(var aValue: integer): integer; //In = [EDX] Out = EAX {$IFDEF WIN32} ASM PUSH ECX MOV ECX, [EDX] BSF EAX, ECX BTR [EDX], EAX POP ECX END; {$ELSE} begin result := FirstBitSet(aValue); _Clear(aValue, result); end; {$ENDIF}
Coming from Delphi 2010 is a bit strange this build configuration, I've set it up to release, this has boost again the performance so is not longer two times slower, silly me...
Some tests from initial position.
Win32)
02/09/2011 23:52:29, Send: 1 77 0 57 Nb1c3 02/09/2011 23:52:29, Send: 2 5 0 228 Nb1c3 Nb8c6 02/09/2011 23:52:29, Send: 3 69 0 359 Nb1c3 Nb8c6 Ng1f3 02/09/2011 23:52:29, Send: 4 5 1 5309 Nb1c3 Nb8c6 Ng1f3 Ng8f6 02/09/2011 23:52:29, Send: 5 52 0 7137 Nb1c3 Nb8c6 Ng1f3 Ng8f6 d4 02/09/2011 23:52:29, Send: 6 39 1 12584 Nb1c3 Nb8c6 d3 Ng8f6 Bc1e3 d5 02/09/2011 23:52:29, Send: 7 56 4 30821 Nb1c3 d6 Ng1f3 Bc8e6 e4 Nb8c6 Bf1b5 02/09/2011 23:52:30, Send: 8 13 25 107775 Ng1f3 d5 e3 Bc8e6 Bf1b5 Nb8c6 Nf3e5 Ng8f6 Nb1c3 02/09/2011 23:52:30, Send: 9 9 35 220814 Ng1f3 d5 e3 Nb8c6 Bf1b5 Ng8f6 Nb1c3 Bc8g4 d4 Bg4xf3 02/09/2011 23:52:31, Send: 10 7 78 451966 Ng1f3 d5 e3 Nb8c6 Bf1b5 Ng8f6 Nb1c3 Bc8g4 d4 e6 02/09/2011 23:52:32, Send: 11 12 157 905577 Ng1f3 d5 e3 Nb8c6 Bf1b5 Ng8f6 Nb1c3 Bc8d7 d4 a6 Bb5d3 02/09/2011 23:52:39, Send: 12 16 650 2598910 e4 e5 Nb1c3 Nb8c6 Ng1f3 Ng8f6 Bf1b5 Bf8d6 d3 O-O Bc1g5 Kg8h8 02/09/2011 23:52:48, Send: 13 23 858 4698110 e4 Ng8f6 Nb1c3 d5 e4xd5 Nf6xd5 Ng1f3 Nb8c6 Bf1b5 Bc8g4 O-O a6 Bb5xc6 b7xc6 02/09/2011 23:52:58, Send: 14 21 1085 7551076 e4 Ng8f6 Nb1c3 d5 e4xd5 Nf6xd5 Ng1f3 Nb8c6 Bf1b5 Nd5xc3 b2xc3 Qd8d5 Ra1b1 Bc8d7Depth 14 reached in 32 seconds.Win64)03/09/2011 0:01:30, Send: 1 77 0 57 Nb1c303/09/2011 0:01:30, Send: 2 5 0 228 Nb1c3 Nb8c603/09/2011 0:01:30, Send: 3 69 0 359 Nb1c3 Nb8c6 Ng1f303/09/2011 0:01:30, Send: 4 5 1 5309 Nb1c3 Nb8c6 Ng1f3 Ng8f603/09/2011 0:01:30, Send: 5 52 0 7137 Nb1c3 Nb8c6 Ng1f3 Ng8f6 d403/09/2011 0:01:30, Send: 6 39 1 12584 Nb1c3 Nb8c6 d3 Ng8f6 Bc1e3 d503/09/2011 0:01:30, Send: 7 56 6 30821 Nb1c3 d6 Ng1f3 Bc8e6 e4 Nb8c6 Bf1b503/09/2011 0:01:31, Send: 8 13 32 107775 Ng1f3 d5 e3 Bc8e6 Bf1b5 Nb8c6 Nf3e5 Ng8f6 Nb1c303/09/2011 0:01:31, Send: 9 9 51 220817 Ng1f3 d5 e3 Nb8c6 Bf1b5 Ng8f6 Nb1c3 Bc8g4 d4 Bg4xf303/09/2011 0:01:32, Send: 10 7 106 451687 Ng1f3 d5 e3 Nb8c6 Bf1b5 Ng8f6 Nb1c3 Bc8g4 d4 e603/09/2011 0:01:34, Send: 11 12 215 908965 Ng1f3 d5 e3 Nb8c6 Bf1b5 Ng8f6 Nb1c3 Bc8d7 d4 a6 Bb5d303/09/2011 0:01:43, Send: 12 19 900 2607087 e4 e5 Nb1c3 Nb8c6 Ng1f3 Bf8d6 Bf1b5 Ng8e7 O-O a6 Bb5d3 Bd6c503/09/2011 0:01:53, Send: 13 23 931 4361400 e4 Ng8f6 Nb1c3 d5 e4xd5 Nf6xd5 Ng1f3 Nb8c6 Bf1b5 Bc8g4 O-O a6 Bb5xc6 b7xc603/09/2011 0:02:10, Send: 14 21 1780 7805174 e4 Ng8f6 Nb1c3 d5 e4xd5 Nf6xd5 Ng1f3 Nb8c6 Bf1b5 Nd5xc3 b2xc3 Qd8d5 Ra1b1 Bc8d7 Bb5d3Depth 14 reached in 40 seconds, 10 seconds slower!, but the worst thing is that they are different, OMG!02/09/2011 23:52:29, Send: 1 77 0 57 Nb1c3
03/09/2011 0:01:30, Send: 1 77 0 57 Nb1c302/09/2011 23:52:29, Send: 2 5 0 228 Nb1c3 Nb8c603/09/2011 0:01:30, Send: 2 5 0 228 Nb1c3 Nb8c602/09/2011 23:52:29, Send: 3 69 0 359 Nb1c3 Nb8c6 Ng1f303/09/2011 0:01:30, Send: 3 69 0 359 Nb1c3 Nb8c6 Ng1f302/09/2011 23:52:29, Send: 4 5 1 5309 Nb1c3 Nb8c6 Ng1f3 Ng8f603/09/2011 0:01:30, Send: 4 5 1 5309 Nb1c3 Nb8c6 Ng1f3 Ng8f602/09/2011 23:52:29, Send: 5 52 0 7137 Nb1c3 Nb8c6 Ng1f3 Ng8f6 d403/09/2011 0:01:30, Send: 5 52 0 7137 Nb1c3 Nb8c6 Ng1f3 Ng8f6 d402/09/2011 23:52:29, Send: 6 39 1 12584 Nb1c3 Nb8c6 d3 Ng8f6 Bc1e3 d503/09/2011 0:01:30, Send: 6 39 1 12584 Nb1c3 Nb8c6 d3 Ng8f6 Bc1e3 d502/09/2011 23:52:29, Send: 7 56 4 30821 Nb1c3 d6 Ng1f3 Bc8e6 e4 Nb8c6 Bf1b503/09/2011 0:01:30, Send: 7 56 6 30821 Nb1c3 d6 Ng1f3 Bc8e6 e4 Nb8c6 Bf1b502/09/2011 23:52:30, Send: 8 13 25 107775 Ng1f3 d5 e3 Bc8e6 Bf1b5 Nb8c6 Nf3e5 Ng8f6 Nb1c303/09/2011 0:01:31, Send: 8 13 32 107775 Ng1f3 d5 e3 Bc8e6 Bf1b5 Nb8c6 Nf3e5 Ng8f6 Nb1c302/09/2011 23:52:30, Send: 9 9 35 220814 Ng1f3 d5 e3 Nb8c6 Bf1b5 Ng8f6 Nb1c3 Bc8g4 d4 Bg4xf303/09/2011 0:01:31, Send: 9 9 51 220817 Ng1f3 d5 e3 Nb8c6 Bf1b5 Ng8f6 Nb1c3 Bc8g4 d4 Bg4xf302/09/2011 23:52:31, Send: 10 7 78 451966 Ng1f3 d5 e3 Nb8c6 Bf1b5 Ng8f6 Nb1c3 Bc8g4 d4 e6
03/09/2011 0:01:32, Send: 10 7 106 451687 Ng1f3 d5 e3 Nb8c6 Bf1b5 Ng8f6 Nb1c3 Bc8g4 d4 e6Different number of searches!02/09/2011 23:52:32, Send: 11 12 157 905577 Ng1f3 d5 e3 Nb8c6 Bf1b5 Ng8f6 Nb1c3 Bc8d7 d4 a6 Bb5d3
03/09/2011 0:01:34, Send: 11 12 215 908965 Ng1f3 d5 e3 Nb8c6 Bf1b5 Ng8f6 Nb1c3 Bc8d7 d4 a6 Bb5d3Different number of searches!02/09/2011 23:52:39, Send: 12 16 650 2598910 e4 e5 Nb1c3 Nb8c6 Ng1f3 Ng8f6 Bf1b5 Bf8d6 d3 O-O Bc1g5 Kg8h8
03/09/2011 0:01:43, Send: 12 19 900 2607087 e4 e5 Nb1c3 Nb8c6 Ng1f3 Bf8d6 Bf1b5 Ng8e7 O-O a6 Bb5d3 Bd6c502/09/2011 23:52:48, Send: 13 23 858 4698110 e4 Ng8f6 Nb1c3 d5 e4xd5 Nf6xd5 Ng1f3 Nb8c6 Bf1b5 Bc8g4 O-O a6 Bb5xc6 b7xc6
03/09/2011 0:01:53, Send: 13 23 931 4361400 e4 Ng8f6 Nb1c3 d5 e4xd5 Nf6xd5 Ng1f3 Nb8c6 Bf1b5 Bc8g4 O-O a6 Bb5xc6 b7xc602/09/2011 23:52:58, Send: 14 21 1085 7551076 e4 Ng8f6 Nb1c3 d5 e4xd5 Nf6xd5 Ng1f3 Nb8c6 Bf1b5 Nd5xc3 b2xc3 Qd8d5 Ra1b1 Bc8d7
03/09/2011 0:02:10, Send: 14 21 1780 7805174 e4 Ng8f6 Nb1c3 d5 e4xd5 Nf6xd5 Ng1f3 Nb8c6 Bf1b5 Nd5xc3 b2xc3 Qd8d5 Ra1b1 Bc8d7 Bb5d3Hooray!, more not so funny debugging sessions in order to find nasty bugs, let's see how I will tackle this...
Friday, September 2, 2011
Trying Delphi XE2, first impressions...
Yesterday I downladed the 30 days demo of the brand new Delphi XE2 with 64 bits and Mac support, I want to share my feelings.
I Don't like:
-No assembler, I had to create quite a few WIN32/64 defines and then create de Pascal versions of those heavily optimized versions.
-My DirectX stuff does not work, I've to find out why...
-64 bits generated assembler code seems to be terrible!, ChessKISS is two times slower in 64 bits when it should be faster since a lot of int64 are used, I will create a more specific post about this.
I like:
-99% of the sutff compiles
-Finds unused variables that Delphi 2010 does not.
-Win64 support
-Mac support (even if I cannot try it)
Considerations:
-Casting pointers, objects or interfaces as integer does not work, one must cast them into int64
Bugs?
-D3DXVector3(-1, 0, 1) returns "[DCC Error] BB.Screen.D3D.pas(895): E2029 ')' expected but ',' found", WTF?!, this function looks like this
-The following code does not compile, but I guess is something to do with the types of the windows function.
I Don't like:
-No assembler, I had to create quite a few WIN32/64 defines and then create de Pascal versions of those heavily optimized versions.
-My DirectX stuff does not work, I've to find out why...
-64 bits generated assembler code seems to be terrible!, ChessKISS is two times slower in 64 bits when it should be faster since a lot of int64 are used, I will create a more specific post about this.
I like:
-99% of the sutff compiles
-Finds unused variables that Delphi 2010 does not.
-Win64 support
-Mac support (even if I cannot try it)
Considerations:
-Casting pointers, objects or interfaces as integer does not work, one must cast them into int64
Bugs?
-D3DXVector3(-1, 0, 1) returns "[DCC Error] BB.Screen.D3D.pas(895): E2029 ')' expected but ',' found", WTF?!, this function looks like this
//-------------------------- // 3D Vector //-------------------------- function D3DXVector3(_x, _y, _z: Single): TD3DXVector3; inline; begin with Result do begin x:= _x; y:= _y; z:=_z; end; end;
-The following code gives the internal error "[DCC Fatal Error] BB.Sync.pas(1182): F2084 Internal Error: AV0A1B31D1-RFFFFFFFF-0"
constructor TFuture<T>.Create(aEvent: TNotification<T>; aPriority: TThreadPriority);
var
e: TNotification<T>;
begin
FResult := Default(T);
FError := '';
FWorker := TFutureThread<T>.Create(aEvent);
FWorker.Priority := aPriority;
FWorker.FreeOnTerminate := False;
end
with this workaround seems to at least compile...
constructor TFuture<T>.Create(aEvent: TNotification<T>; aPriority: TThreadPriority);
var
e: TNotification<T>;
begin
FResult := Default(T);
FError := '';
//Bloddy XE...
e := aEvent;
FWorker := TFutureThread<T>.Create(e);
FWorker.Priority := aPriority;
FWorker.FreeOnTerminate := False;
end;
-The following code does not compile, but I guess is something to do with the types of the windows function.
function GetCPUCount: integer; var ProcessMask: DWORD; SystemMask: DWORD; begin Result := 1; if GetProcessAffinityMask(GetCurrentProcess, ProcessMask, SystemMask) then begin while (ProcessMask <> 0) do begin if Odd(ProcessMask) then Inc(Result); ProcessMask := ProcessMask shr 1; end; Dec(Result); end; end;
[DCC Error] BB.Utils.pas(1031): E2033 Types of actual and formal var parameters must be identical.
Monday, August 22, 2011
Beware of the closures
The other day I was debugging ChessKISS looking for strange calls to system in order to increase performance and surprisingly I found too many calls to GetMem() , very strange I thought, I check the call stack and I end up in a very innocent function called ReceiveDataFromWinboard
function TChessEngine.ReceiveDataFromWinboard(out aMove: TMove): TMoveStatus; var cmd: TString; notification: TNotification<AnsiString>; begin aMove := NO_MOVE; result := msNone; if FWinboardTask = nil then begin notification := FWinboard.GetMessage; FWinboardTask := TFuture<AnsiString>.Create(notification, tpNormal); end; if FWinboardTask.Available then begin try cmd := TString(FWinboardTask.GetValue); if FWinboardTask.Error <> '' then begin FLog.Add(FWinboardTask.Error); raise FWinboardTask.GetException; end; if (not IsCommand(cmd)) and (FGameStatus = gsPlay) then result := TCommandHandler.Instance.ParseCommand(cmd, aMove); finally FreeAndNil(FWinboardTask); end; end; end;
Again nothing to be worried, a new thread is created every time the remote application sends a message, so I had to disassembler in order to see what was going on and voila!
004A0F38 55 push ebp 004A0F39 8BEC mov ebp,esp 004A0F3B 6A00 push $00 004A0F3D 53 push ebx 004A0F3E 56 push esi 004A0F3F 8BD8 mov ebx,eax 004A0F41 33C0 xor eax,eax 004A0F43 55 push ebp 004A0F44 68AF0F4A00 push $004a0faf 004A0F49 64FF30 push dword ptr fs:[eax] 004A0F4C 648920 mov fs:[eax],esp 004A0F4F B201 mov dl,$01 004A0F51 A1880E4A00 mov eax,[$004a0e88] 004A0F56 E8393EF6FF call TObject.Create 004A0F5B 8BF0 mov esi,eax 004A0F5D 8D45FC lea eax,[ebp-$04] 004A0F60 8BD6 mov edx,esi 004A0F62 85D2 test edx,edx 004A0F64 7403 jz $004a0f69 004A0F66 83EAF8 sub edx,-$08 004A0F69 E84E84F6FF call @IntfCopy
WTF!?, Delphi is automatically creating an object whose name is xxx$ActRec and then copying some information. Well, we have to isolated the problem and see if this still happening or it is a combination of closures with threads.procedure TForm39.OnTimer(Sender: TObject);
begin if FCount = 0 then begin FCount := 100; Call( function: integer begin result := Random(100); end ); end else Dec(FCount); end; procedure TForm39.Call(aTest: TTest); begin aTest(); end; procedure TForm39.FormCreate(Sender: TObject); begin FCount := 0; end
So here we go, in this example a Timer is calling a function which ONLY calls a closure once every 100 times, so I checked again the assembler code generated and I got very disappointed, Delphi automatically does his stuff every time there is a closure function, usually this is more than ok on a regular application, but on an application like ChesskISS which is heavily based on threads and the main loop is called millions of times per second this is not acceptable.
The solution was to add a new constructor in the TFuture<T> class adding a simple notificator as a parameter, so the code was clean again, sadly CheckKISS still slow, shame on me...
Big files
In a big project you might end up with tons of files, maybe you group them by directory, but it can be quite handy to group them all into a single file, for that purpose we have the TBigFile class, which contains the following methods:
TBigFile = class constructor Create; destructor Destroy; override; procedure AddStream(const aName: string; aStream: TStream); procedure AddFile(const aFileName: string); procedure BuildFrom(const aFileName: string); procedure Clear; function GetFiles: TStrings; overload; function GetFiles(const aPath, aMask: string): TStrings; overload; function GetDirectories: TStrings; function Load(const aPath, aFileName: string): TStream; overload; function Load(const aFileName: string): TStream; overload; procedure Save(const aFileName: string); property Signature: AnsiString read FSignature write FSignature;
end;
In the demos file, there is a project named BigFile.dpr which shows how to create and load a big file, let's make a summary:
Create the class
FBig := TBigFile.Create;Add some files
FBig.AddStream('test.txt', stream);
Since the parameter is a stream we can add any kind of resource, aditionally there is a method for directly add physical files called AddFile().
You can also specify different folders just add the folder name prior to the file name like 'one/test.txt'
Save the big file
FBig.Save('TheOne.big');
Load the big file
FBig.BuildFrom('TheOne.big');
Loading files
stream := FBig.Load('test.txt')
In the BB API, there are many load functions that support streams, so one can dothings likeent.LoadStream(FBig.Load('donut.asc'));
Getting files
strings := FBig.GetFiles;Now strings will contain all files located inside the big file, there is an overloaded method to retrieve files per directory and maskstrings := FBig.GetFiles('first', '*.bmp');These are the classes that support streams:
- TEnt
- TAnimations
- TSimpleSprite
- TLayer
- TIni
- TImageEx
- TSurface
Of course those streams can be compresses/decompressed, an easy way is to use the helper class TStreamHelper which holds several methods to help you:
TStreamHelper = class helper for TStream
public
function ToString: string;
procedure WriteString(const aString: string);
function Bof: boolean;
function Eof: boolean;
class function Compress(aStream: TStream): TStream;
class function Decompress(aStream: TStream): TStream;
class function StreamToString(aStream: TStream): string;
class function StringToStream(const aString: string): TStream;
class function MemoryToStream(aBuffer: pointer; aSize: int64): TStream;
class function StringToCompressed(const aString: string): TStream;
class function CompressToString(aStream: TStream): string;
class function ComponentToStream(aComponent: TComponent): TStream;
class function ComponentToString(aComponent: TComponent): string;
class procedure Save(const aFileName: string; aStream: TStream);
end;
Subscribe to:
Posts (Atom)